Salivary mRNA Profiling, biomarkers and
related methods and kits of parts

[0001] This invention was made with Government support of grant
U01-DE15018 awarded by the NIH. The Government has certain rights on this
invention.

Field of the disclosure 

[0002] The present disclosure relates to profiling of biomarkers and to
methods and kits using said biomarkers. In particular, the present disclosure
relates to biomarkers for detection of cancer, and in particular of Oral Cavity
and Oropharyngeal Squamous Cell Carcinoma (OSCC).

Background of the disclosure 

[0003] Biomarkers are molecular indicators of a specific biological property,
a biochemical feature or facet that can be used to measure the progress of
disease or the effects of treatment.

[0004] Proteins and nucleic acids are exemplary biomarkers. In particular, it
has been widely accepted that genomic messengers detected extracellularly
can serve as biomarkers for diseases [6]. Nucleic acids have
been identified in most bodily fluids including blood, urine and cerebrospinal
fluid, and have been successfully adopted for use as diagnostic biomarkers
for diseases [28, 42, 49].

[0005] Saliva is not a passive "ultrafiltrate" of serum [41], but contains a
distinctive composition of enzymes, hormones, antibodies, and other
molecules. In the past 10 years, the use of saliva as a diagnostic fluid has
been successfully applied in diagnostics and in predicting populations at risk
for a variety of conditions [47].

[0006] Specific and informative biomarkers in saliva are desirable for
diagnosing disease and monitoring human health [30, 47, 6]. For example,
biomarkers have been identified in saliva for monitoring caries, periodontitis,
oral cancer, salivary gland diseases, and systemic disorders, e.g., hepatitis

identified in saliva and used for oral cancer detection [30, 36]. RNA is more
labile than DNA and is presumed to be highly susceptible to degradation by
RNases. Furthermore, RNase activity is reported to be elevated in saliva,
which constitutes an inexpensive, non-invasive and accessible bodily fluid
suitable to act as an ideal diagnostic medium. In particular, RNase activity is
reported to be elevated in saliva of cancer patients [83]. It has thus been
commonly presumed that human mRNA could not survive extracellularly in
saliva. OSCC is the sixth most common cancer in the world, and affects
50,000 Americans annually. Worldwide, cancers of the oral cavity and
oropharynx represent a great public health problem. OSCC accounts for
nearly 50% of all newly diagnosed cancers in India and is a leading cause of
death in France [1].

[0007] Despite improvements in locoregional control, morbidity and mortality
rates have improved little in the past 30 years [2]. Therefore, early detection
or prevention of this disease is likely to be most effective. Detecting OSCC at
an early stage is believed to be the most effective means to reduce death and
disfigurement from this disease. The absence of definite early warning signs
for most head and neck cancers suggests that sensitive and specific
biomarkers are likely to be important in screening high risk patients.

Summary of the disclosure 

[0008] According to a first aspect, a method to detect a biomarker in a bodily
fluid including a cell phase and a fluid phase, wherein the biomarker is an
extracellular mRNA and the bodily fluid is saliva, preferably unstimulated
saliva, is disclosed. The method comprises: providing a cell-free fluid phase
portion of the bodily fluid; and detecting the extracellular mRNA in the
cell-free fluid phase portion of the bodily fluid.

[0009] In particular, detecting the extracellular mRNA can comprise:
isolating the extracellular mRNA from the cell-free fluid phase portion of the
bodily fluid, and amplifying the extracellular mRNA.

fluid, including a cell phase and a fluid phase, wherein the bodily fluid is
saliva, is disclosed. The method comprises: providing a cell-free fluid phase
portion of the bodily fluid; and detecting a transcriptome pattern in the
cell-free fluid phase portion of the bodily fluid. The bodily fluid is preferably
unstimulated saliva.

[0011] In particular, detecting the transcriptome pattern in the saliva
supernatant is preferably performed by microarray assay, most preferably by
high-density oligonucleotide microarray assay. Detecting the transcriptome
pattern in the saliva supernatant can also be performed by quantitative PCR
analysis or RT-PCR analysis.

[0012] According to a third aspect, a method to detect genetic alterations in
an organ by analyzing a bodily fluid draining from the organ and including a
cell phase and a fluid phase, is disclosed. The bodily fluid is in particular
saliva, preferably unstimulated saliva, and the method comprises: providing a
cell-free fluid phase portion of the bodily fluid; detecting a transcriptome
pattern in the cell-free fluid phase portion of the bodily fluid; and comparing
the transcriptome pattern with a predetermined pattern, the predetermined
pattern being indicative of a common transcriptome pattern of normal cell-free
fluid phase portion of the bodily fluid.

[0013] According to a fourth aspect, a method to detect genetic alteration of
a gene in an organ by analyzing a bodily fluid draining from the organ and
including a cell phase and a fluid phase, is disclosed. The bodily fluid is in
particular saliva and the method comprises: providing a cell-free fluid phase
portion of the bodily fluid; detecting an mRNA profile of the gene in the
cell-free fluid phase portion of the bodily fluid; and comparing the mRNA
profile of the gene with a predetermined mRNA profile of the gene, the
predetermined mRNA profile of the gene being indicative of the mRNA profile
of the gene in normal cell-free fluid phase portion of the bodily fluid.

[0014] According to a fifth aspect, a method to diagnose an oral or systemic
pathology, disease or disorder in a subject, is disclosed. The method

detecting in the provided cell-free saliva fluid phase portion an mRNA profile
of a gene associated with the pathology, disease or disorder; and comparing
the mRNA profile of the gene with a predetermined mRNA profile of the gene,
the predetermined mRNA profile of the gene being indicative of the presence
of the pathology, disease, or disorder in the subject.

[0015] In a first embodiment, the pathology, disease or disorder is a cancer
of the oral cavity and/or of oropharynx, the bodily fluid is saliva and the gene
is selected from the group consisting of the gene coding for IL8 (Interleukin 8),
IL1B (Interleukin 1, beta), DUSP1 (Dual specificity phosphatase 1), H3F3A
(H3 histone, family 3A), OAZ1 (Ornithine decarboxylase antizyme 1), S100P
(S100 calcium binding protein P) and SAT (Spermidine/spermine
N1-acetyltransferase).

[0016] In a second embodiment, the pathology, disease or disorder is a
cancer of the oral cavity and/or of oropharynx, the bodily fluid is blood serum
and the gene is selected from the group consisting of IL6 (interleukin 6),
H3F3A, TPT1 (Tumor protein translationally controlled 1), FTH1 (Ferritin
heavy polypeptide 1), NCOA4 (Nuclear receptor coactivator 4) and ARCR
(Ras homolog gene family, member A).

[0017] Diseases that can be diagnosed include oropharyngeal squamous
cell carcinoma and possibly other systemic diseases.

[0018] According to a sixth aspect, a method to diagnose an oral or
systemic pathology, disease or disorder in a subject is disclosed. The method
comprises: providing a cell-free fluid phase portion of the saliva of the subject;
detecting in the provided cell-free fluid phase portion a transcriptome pattern
associated with the pathology, disease or disorder; and comparing the
transcriptome pattern with a predetermined pattern, recognition in the
transcriptome pattern of characteristics of the predetermined pattern being
diagnostic for the pathology, disease or disorder in the subject.

the oral cavity and/or of oropharynx, and the transcriptome pattern includes a
transcript selected from the group consisting of transcripts for IL8, IL1B,
DUSP1, H3F3A, OAZ1, S100P and SAT from saliva.

[0020] According to a seventh aspect, a method to diagnose an oral or
systemic pathology, disease or disorder in a subject is disclosed, the method
comprising: providing serum of the subject; detecting in the provided serum a
transcriptome pattern associated with the pathology, disease or disorder; and
comparing the transcriptome pattern with a predetermined pattern, recognition
in the transcriptome pattern of characteristics of the predetermined pattern
being diagnostic for the pathology, disease or disorder in the subject.

[0021] In an embodiment, the pathology, disease or disorder is a cancer of
the oral cavity and/or of oropharynx, and the transcriptome pattern includes a
transcript selected from the group consisting of transcripts for IL6, H3F3A,
TPT1, FTH1, NCOA4 and ARCR from serum.

[0022] Diseases that can be diagnosed include oropharyngeal squamous
cell carcinoma and possibly other systemic diseases.

[0023] According to an eighth aspect, a method for diagnosing a cancer in a
subject is disclosed. The method comprises: providing a bodily fluid of the
subject; detecting in the bodily fluid a profile of a biomarker; and comparing
the profile of the biomarker with a predetermined profile of the biomarker,
recognition in the profile of the biomarker of characteristics of the
predetermined profile of the biomarker being diagnostic for the cancer.

[0024] Pathologies, diseases or disorders that can be diagnosed include
oropharyngeal squamous cell carcinoma and possibly other systemic
diseases. Biomarkers include IL8, IL1B, DUSP1, H3F3A, OAZ1, S100P, SAT,
IL6, H3F3A, TPT1, FTH1, NCOA4 and ARCR.

[0025] In a first embodiment, the pathology, disease or disorder is
oropharyngeal squamous cell carcinoma, the biomarker is selected from the
group consisting of IL8, IL1B, DUSP1, H3F3A, OAZ1, S100P, SAT, the bodily

the mRNA profile of the biomarker.

[0026] In a second embodiment, the pathology, disease or disorder is
oropharyngeal squamous cell carcinoma, the biomarker is selected from the
group consisting of IL6, H3F3A, TPT1, FTH1, NCOA4 and ARCR, the bodily
fluid is serum and detecting a profile of a biomarker is performed by detecting
the mRNA profile of the biomarker.

[0027] In a third embodiment, the pathology, disease or disorder is
oropharyngeal squamous cell carcinoma, the biomarker is IL6, the bodily fluid
is blood serum and detecting a profile of a biomarker is performed by
detecting the protein profile of the biomarker.

[0028] According to an eighth aspect, a kit for the diagnosis of an oral
and/or systemic pathology, disease or disorder is disclosed, the kit
comprising: an identifier of at least one biomarker in a bodily fluid, the
biomarker selected from the group consisting of IL8, IL1B, DUSP1, H3F3A,
OAZ1, S100P, SAT, IL6, H3F3A, TPT1, FTH1, NCOA4 and ARCR; and a
detector for the identifier.

[0029] Pathologies, diseases or disorders that can be diagnosed include
oropharyngeal squamous cell carcinoma, and possibly other systemic
diseases.

[0030] The identifier and the detector are to be used in detecting the bodily
fluid profile of the biomarker according to the methods herein disclosed. In
particular, the identifier is associated with the biomarker in the bodily fluid,
and the detector is used to detect the identifier, the identifier and the detector
thereby enabling the detection of the bodily fluid profile of the biomarker.

[0031] According to a ninth aspect, a method to diagnose an oral and/or
systemic pathology, disease or disorder is disclosed. The method comprises:
using salivary and/or serum mRNAs as biomarkers for oral and/or systemic
pathology, disease or disorder.

biomarker selected from the group consisting of IL8, IL1B, DUSP1, H3F3A,
OAZ1, S100P, SAT, IL6, H3F3A, TPT1, FTH1, NCOA4 and ARCR.

[0033] Diseases that can be diagnosed include oropharyngeal squamous
cell carcinoma, and possibly other systemic diseases.

[0034] According to a tenth aspect, a method to diagnose an oral and/or
systemic pathology is disclosed. The method comprises: using salivary or
serum proteins as biomarkers for oral and/or systemic pathology, disease or
disorder, in particular IL6 protein in serum and IL8 protein in saliva.

[0035] The methods and kits of the disclosure will be exemplified with the
aid of the enclosed figures.

Description of the figures 

[0036] Figure 1A shows results of an RT-PCR typing for ACTB performed on
RNA isolated from cell-free saliva supernatant from human beings after
storage for 1 month (lane 2), 3 months (lane 3) and 6 months (lane 4), with a
100bp ladder molecular weight marker (lane 1) and a negative control
(omitting templates) (lane 5). A molecular size marker is indicated on the left
side of the Figure by arrows.

[0037] Figure 1B shows results of an RT-PCR performed on RNA isolated
from cell-free saliva supernatant from human beings (lane 1) and typing
GAPDH (B1), RPS9 (B2) and ACTB (B3), with positive control (human total
RNA, BD Biosciences Clontech, Palo Alto, CA, USA) (lane 2) and negative
controls (omitting templates) (lane 3). A molecular size marker is indicated on
the left side of the Figure by arrows.

[0038] Figure 2A shows results of a capillary electrophoresis performed to
monitor RNA amplification from RNA isolated from cell-free saliva supernatant
from human beings. Lanes 1 to 5 show 1kb DNA ladder (lane 1), 5 µl saliva
after RNA isolation (undetectable) (lane 2), 1 µl two round amplified cRNA
(range from 200 bp to ~4kb) (lane 3), 1 µl cRNA after fragmentation (around

marker is indicated on the left side and right side of the Figure by arrows.

[0039] Figure 2B shows results of a PCR performed on RNA isolated from
cell-free saliva supernatant from human beings at various stages of
amplification and typing for ACTB. Lanes 1 to 8 show 100bp DNA ladder (lane
1), total RNA isolated from cell-free saliva (lane 2), 1st round cDNA (lane 3),
1st round cRNA after RT (lane 4), 2nd round cDNA (lane 5), 2nd round cRNA
after RT (lane 6), positive control (human total RNA, BD Biosciences
Clontech, Palo Alto, CA, USA) (lane 7) and negative control (omitting
templates) (lane 8). A molecular size marker is indicated on the left side of the
Figure by arrows.

[0040] Figure 2C shows a diagram reporting results of the analysis of target
cRNA performed by Agilent 2100 bioanalyzer before hybridization on
microarray. On x axis, the molecular weight (bp) of the fragmented cRNA with
reference to the marker RNA is indicated. On y axis, the quantity of the
fragmented cRNA (µg/ml) measurable by a Bioanalyzer is indicated.

[0041] Figure 3 shows results of an RT-PCR performed on RNA isolated from
cell-free saliva supernatant from human beings (saliva) together with a ladder
(Mrkr), positive controls (Ctrl(+)) and negative controls (Ctrl(-)) and typing for
IL6 (IL6), IL8 (IL8) and β-Actin (β-Actin).

[0042] Figure 4 shows results of a PCR performed for the housekeeping
β-actin on whole saliva, serum samples, and samples that had been centrifuged
at 0 xg (0 xg), 1,000 xg (1,000 xg), 2,600 xg (2,600 xg), 5,000 xg (5,000 xg)
and 10,000 xg (10,000 xg) using genomic DNA as marker (Mrkr) for cell lysis
and spillage of intracellular compounds.

[0043] Figure 5A shows a diagram reporting the mean concentrations of
mRNA for IL8 detected in replicate samples by qRT-PCR in saliva from
patients with OSCC (Cancer) and normal subjects (Control). On x axis the
sample groups are reported. On y axis the number of copies detected is
reported.






WO 2005/081867 



PCT/US2005/005263 



[0044] Figure 5B shows a diagram reporting the mean concentrations of IL8
detected in replicate samples by ELISA in saliva from patients with OSCC
(Cancer) and normal subjects (Control). On x axis the sample groups are
reported. On y axis the concentration, expressed in pg/ml, is reported.

[0045] Figure 6A shows a diagram reporting the mean concentrations of
mRNA for IL6 detected in replicate samples by qRT-PCR in serum from
patients with OSCC (Cancer) and normal subjects (Control). On x axis the
sample groups are reported. On y axis the number of copies detected is
reported.

[0046] Figure 6B shows a diagram reporting the mean concentrations of IL6
detected in replicate samples by ELISA in serum from patients with OSCC
(Cancer) and normal subjects (Control). On x axis the sample groups are
reported. On y axis the concentration, expressed in pg/ml, is reported.

[0047] Figure 7A shows a diagram reporting the Receiver Operating
Characteristic (ROC) curve calculated for IL8 in saliva. On the x axis
1-specificity is reported. On y axis the sensitivity is reported.

[0048] Figure 7B shows a diagram reporting the ROC curve calculated for
IL6 in serum. On the x axis 1-specificity is reported. On y axis the sensitivity is
reported.

[0049] Figure 7C shows a diagram reporting the ROC curve calculated for a
combination of IL8 in saliva and IL6 in serum. On the x axis 1-specificity is
reported. On y axis the sensitivity is reported.

[0050] Figure 8 shows results of a PCR reaction performed on serum
human mRNA phenotyping of salivary mRNAs for RPS9 (Lane 2, 3 and 4);
GAPDH (Lane 5, 6 and 7); B2M (Lane 8, 9 and 10) and ACTB (Lane 11, 12
and 13), together with DNA ladder, as a control (Lane 1).

[0051] Figure 9 shows a diagram reporting a ROC curve of the logistic
regression model for the circulating mRNA in serum. On the x axis
1-specificity is reported. On y axis the sensitivity is reported.

regression trees (CART) model assessing the serum mRNA predictors for
OSCC.

[0053] Figure 11 shows a diagram reporting a ROC curve of the logistic
regression model for the predictive power of combined salivary mRNA
biomarkers. On the x axis 1-specificity is reported. On y axis the sensitivity is
reported.

[0054] Figure 12 shows a diagram reporting the classification and
regression trees (CART) model assessing the salivary mRNA predictors for
OSCC.

Detailed description of the preferred embodiments 

[0055] A method to detect an extracellular mRNA in a bodily fluid is
disclosed, wherein the bodily fluid is saliva and the extracellular mRNA is
detected in a cell-free fluid phase portion of saliva. Presence of RNAs in the
cell-free fluid phase portion of saliva was confirmed by the procedures
extensively described in the Examples, the quality of the detected mRNA
meeting the demand for techniques such as PCR, qPCR, and microarray
assays.

[0056] In the method, detecting extracellular mRNAs, herein also informative
mRNAs, is performed in a bodily fluid, saliva, that meets the demands of an
inexpensive, non-invasive and accessible bodily fluid to act as an ideal
medium for investigative analysis.

[0057] Detecting informative mRNAs is in particular performed in a portion
of saliva (cell-free fluid phase) wherein presence of microorganisms and
extraneous substances such as food debris is minimized, which allows
analyzing the molecules in simple and accurate fashion. Preferably, the
cell-free fluid phase portion is derived from unstimulated saliva.

[0058] In the method, the saliva can be collected according to procedures
known in the art and then processed to derive the cell-free fluid phase thereof,
for example by centrifugation of the collected saliva, which results in a

procedures extensively described in Examples 1, 5 and 13)

[0059] According to the present disclosure, the conditions for separating the
cell phase and the fluid phase of saliva are optimized to avoid mechanical
rupture of cellular elements, which would contribute to the RNA detected in
the cell-free fluid phase.

[0060] In embodiments wherein the separation is performed by
centrifugation, optimization can be performed by testing housekeeping genes
on samples centrifuged at various speeds and on whole saliva samples, using
DNA as a marker of cell lysis and spillage, to derive the optimized
centrifugation speed (see procedure described in Example 5).

[0061] Detection of the extracellular mRNA in the cell-free saliva fluid phase
portion (salivary mRNA) can then be performed by techniques known in the
art allowing qualitative and/or quantitative mRNA analysis, such as RT-PCR,
Q-PCR and microarray. The detection can in particular be performed
according to procedures that can include isolation and amplification of the
salivary mRNA and that are exemplified in the Examples.

[0062] Detection of the salivary mRNA in the method can be performed for 
the purpose of profiling the salivary mRNA. 

[0063] In a first series of embodiments, the expression of predetermined
genes can be profiled in a cell-free fluid phase portion of saliva. In those
embodiments, detection of the mRNA profile can be performed by RT-PCR or
any techniques allowing identification of a predetermined target mRNA.
Quantitative analysis can then be performed with techniques such as
Quantitative PCR (Q-PCR) to confirm the presence of mRNA identified by the
RT-PCR. A reference database can then be generated based on the mRNA
profiles so obtained. Exemplary procedures to perform such qualitative and
quantitative analyses of salivary mRNA are described in detail in Examples
1, 4 and 9.

saliva can be performed by detecting a transcriptome pattern in the cell-free
fluid phase portion of saliva. Detection of the transcriptome pattern can be
performed by isolating and linearly amplifying salivary mRNA, which can then
be profiled with techniques such as high-density oligonucleotide microarrays.
Quantitative analysis can then be performed with techniques such as Q-PCR
to confirm the presence of mRNA in the pattern identified by the microarray. A
reference database can then be generated based on the mRNA profiles so
obtained. Exemplary procedures to perform such qualitative and quantitative
analyses of salivary mRNA are described in detail in Examples 2-3, 9-10 and
14-15.

[0065] Profiling salivary RNA can be performed to detect and/or monitor
human health and disease or to investigate biological questions, such as for
example, the origin, release and clearance of mRNA in saliva. The salivary
mRNA provides actual or potential biomarkers to identify populations and
patients at high risk for oral and systemic pathologies, diseases or disorders.

[0066] Alterations of the salivary mRNA profiles and transcriptome patterns
characterizing the cell-free fluid phase portion of saliva of normal subjects can
be indicative of pathologies, diseases or disorders of various origin. Examples
of those pathologies, diseases or disorders are provided by the inflammatory
conditions of the oral cavity, OSCC or other conditions such as diabetes,
breast cancer and HIV.

[0067] Also, comparison between the mRNA profiles and transcriptome
patterns of subjects affected with a determined pathology, disease or disorder
can result in the identification of informative biomarkers for the determined
pathology, disease or disorder. In particular, salivary mRNA can be used as
diagnostic biomarkers for oral and systemic pathologies, diseases or
disorders that may be manifested in the oral cavity.

[0068] In particular, salivary mRNA can be used as diagnostic biomarkers
for cancer that may be manifested in and/or affect the oral cavity. Saliva-based

diagnostics.

[0069] In case of various forms of cancer, alterations of the normal salivary
mRNA and transcriptome patterns can also reflect the genetic alterations in
one or more portions of the oral cavity which are associated with presence of
the tumor. For oral cancer patients, the detected cancer-associated RNA
signature is likely to originate from the matched tumor and/or a systemic
response (local or distal) that further reflects itself in the whole saliva coming
from each of the three major sources (salivary glands, gingival crevicular fluid,
and oral mucosal cells). It is conceivable that disease-associated RNA can
find its way into the oral cavity via the salivary gland or circulation through the
gingival crevicular fluid. A good example is the elevated presence of HER-2
proteins in saliva of breast cancer patients [87].

[0070] A common transcriptome of normal cell-free saliva, including
approximately 185 different human mRNAs, also defined as Normal Salivary
Core Transcriptome (NSCT), was identified as the outcome of a transcriptome
analysis performed on cell-free fluid phase of saliva from normal subjects (see
Example 2, Table 2).

[0071] Since the NSCT was identified using the probe sets on the HG U133A
microarray representing only ~19,000 human genes, and the human genome
is composed of more than 30,000 genes [48], it is expected that more human
mRNAs will be identified in saliva by other methodologies and that additional
salivary patterns are identifiable by the method herein disclosed.

[0072] The NSCT and/or other salivary transcriptome patterns in cell-free
saliva from normal populations can serve in a Salivary Transcriptome
Diagnostics (SIvTD), for potential applications in disease diagnostics as well
as normal health surveillance.

[0073] Accordingly, in a first embodiment of the SIvTD, a method to
diagnose an oral or systemic pathology, disease or disorder in a subject is
disclosed. The method comprises: providing a cell-free fluid phase portion of

portion an mRNA profile of a gene associated with the disease; and
comparing the mRNA profile of the gene with a predetermined mRNA profile of
the gene, the predetermined mRNA profile of the gene being indicative of the
presence of the disease in the subject.

[0074] In a second embodiment of the SIvTD, a method to diagnose an oral
or systemic pathology, disease or disorder in a subject is disclosed. The
method comprises: providing cell-free saliva supernatant of the subject;
detecting in the cell-free saliva supernatant a transcriptome pattern
associated with the pathology, disease or disorder; and comparing the
transcriptome pattern with a predetermined pattern, recognition in the
transcriptome pattern of characteristics of the predetermined pattern being
diagnostic for the pathology, disease or disorder in the subject.

[0075] In a third embodiment of the SIvTD, a method to identify a biomarker
associated with a predetermined pathology, disease or disorder is disclosed.
The method comprises: detecting a first mRNA profiling of a predetermined
gene in cell-free fluid phase portion of saliva of a subject affected by the
pathology, disease or disorder; detecting a second mRNA profiling of the
predetermined gene in cell-free fluid phase portion of saliva of a normal
subject; and comparing the first mRNA profiling with the second mRNA
profiling, recognition of differences between the first mRNA profiling and the
second mRNA profiling, the differences validated by statistical analysis, being
indicative of the identification of the predetermined gene as a biomarker for
the predetermined pathology, disease or disorder.

[0076] In particular, the difference in RNA profiling between one
disease category and one healthy category is analyzed by microarray statistical
methodologies. The algorithms used include MAS 5.0, DNA-Chip analyzer 1.3
and RMA 3.0. Preferably, the analysis is performed by a combination of these
methods to provide more powerful and accurate markers to test. The markers
identified by microarray will then be tested by conventional techniques such
as Q-PCR.

performed, wherein the cell-free saliva is contacted with an identifier for the
presence or expression of the biomarker, and the presence of the identifier
associated with presence or expression of the biomarker is detected, preferably
by means of a detector.

[0078] The SIvTD allows detection of diseases such as tumors at a stage
early enough that treatment is likely to be successful, with screening tools
exhibiting the combined features of high sensitivity and high specificity.
Moreover, the screening tools are sufficiently noninvasive and inexpensive to
allow widespread applicability.

[0079] The results of the above methods of the SIvTD can be integrated with
a corresponding analysis performed at an mRNA and/or protein level and/or in
other bodily fluids, such as blood serum.

[0080] Biomarkers, such as protein or transcriptome patterns detected in
serum, can also serve in a Serum Transcriptome Diagnostics (SrmTD), for
potential applications in disease diagnostics as well as normal health
surveillance. Embodiments of the SrmTD include methods corresponding to
the ones reported above for the SIvTD, wherein the bodily fluid analyzed is
serum instead of cell-free saliva.

[0081] In particular, the results obtained following the SIvTD can be
combined with results obtained with the SrmTD, in a combined Salivary and
Serum Transcriptome approach (SSTD).

[0082] According to the SSTD, a diagnostic method can be performed,
wherein the bodily fluid, serum and/or saliva, is contacted with an identifier for
the presence or expression of the biomarker, wherein the biomarker can be a
protein or an mRNA, and the presence of the identifier associated with presence
or expression of the biomarker is detected, preferably by means of a detector.

[0083] Examples of the SIvTD, SrmTD and SSTD are herein provided with
reference to the OSCC. The person skilled in the art can derive the

than OSCC upon reading of the present disclosure.

[0084] Profiling of two specific cytokines, IL6 and IL8, was performed in the
cell-free fluid phase portion of saliva and serum of patients with OSCC
according to procedures extensively disclosed in Examples 4-8. IL8 was
detected at higher concentrations in the saliva of patients with OSCC (P <
0.01) and IL6 was detected at higher concentrations in the serum of patients
with OSCC (P < 0.01). These results were confirmed at both the mRNA and
the protein levels, and the results were concordant. The concentration of IL8
in saliva and IL6 in serum did not appear to be associated with gender, age,
or alcohol or tobacco use (P > 0.75). The data were subjected to statistical
analysis, in particular to ROC analysis, which allowed determination of the
threshold value, sensitivity, and specificity of each biomarker for detecting
OSCC (see Example 8, Table 3). Furthermore, the inventors were able to
measure mRNA in salivary specimens.
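The ROC analysis referred to in paragraph [0084] can be illustrated with a minimal sketch. It is not the procedure of Example 8: the function names, the choice of Youden's J statistic as the criterion for picking a threshold, and the convention that a value at or above the threshold counts as positive are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def roc_points(values: Sequence[float],
               labels: Sequence[int]) -> List[Tuple[float, float, float]]:
    """Return (threshold, sensitivity, specificity) for every observed cutoff.

    labels: 1 = OSCC case, 0 = control (hypothetical encoding).
    A sample is called positive when its value is >= threshold.
    """
    n_cases = sum(1 for y in labels if y == 1)
    n_controls = len(labels) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    points = []
    for threshold in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels)
                       if y == 1 and v >= threshold)
        true_neg = sum(1 for v, y in zip(values, labels)
                       if y == 0 and v < threshold)
        points.append((threshold, true_pos / n_cases, true_neg / n_controls))
    return points


def best_threshold(values: Sequence[float],
                   labels: Sequence[int]) -> Tuple[float, float, float]:
    """Cutoff maximizing Youden's J = sensitivity + specificity - 1.

    Youden's J is an assumed selection rule, not one stated in the disclosure.
    """
    return max(roc_points(values, labels), key=lambda p: p[1] + p[2] - 1.0)


def auc(values: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise case/control comparisons."""
    cases = [v for v, y in zip(values, labels) if y == 1]
    controls = [v for v, y in zip(values, labels) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))
```

Calling `best_threshold(concentrations, labels)` on measured biomarker concentrations returns one candidate cutoff with its sensitivity and specificity; plotting sensitivity against 1-specificity over all `roc_points` gives a curve of the kind shown in Figures 7A to 7C.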

[0085] A transcriptome analysis of unstimulated saliva collected from 
patients with OSCC and normal subjects was performed as disclosed in 
Examples 9-12 and in Examples 13-16. 

[0086] RNA isolation was performed from the saliva supernatant, followed
by two-round linear amplification with T7 RNA polymerase. Human Genome
U133A microarrays were used to profile the human salivary transcriptome.
The different gene expression patterns were analyzed by combining a t test
comparison and a fold-change analysis on 10 matched cancer patients and
controls. Quantitative polymerase chain reaction (qPCR) was used to validate
the selected genes that showed a significant difference (P < 0.01) by
microarray. The predictive power of these salivary mRNA biomarkers was
analyzed by receiver operating characteristic curve and classification models.
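
By way of illustration only, a gene selection that combines a t test with a fold-change criterion can be sketched as follows. The expression values are hypothetical. The use of a paired t statistic on untransformed intensities, the 3.5-fold cut-off as a default, and the comparison against the tabulated two-sided critical value for 9 degrees of freedom at P < 0.01 (3.250) are assumptions of the sketch; the disclosure does not specify the exact form of the test.

```python
import math
import statistics

# Two-sided critical t value for P < 0.01 with 9 degrees of freedom
# (10 matched pairs); taken from standard t tables.
T_CRITICAL_DF9_P01 = 3.250


def paired_t_statistic(cancer, control):
    """t statistic of the pairwise differences (assumed paired design)."""
    diffs = [a - b for a, b in zip(cancer, control)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


def select_gene(cancer, control, min_fold=3.5, t_critical=T_CRITICAL_DF9_P01):
    """Keep a gene only if it passes both the fold-change and the t criteria."""
    fold = statistics.mean(cancer) / statistics.mean(control)
    t_stat = paired_t_statistic(cancer, control)
    return fold >= min_fold and abs(t_stat) > t_critical


# Hypothetical signal intensities for 10 matched patient/control pairs.
control = [100, 120, 110, 130, 90, 140, 100, 150, 120, 110]
elevated_gene = [400, 520, 380, 610, 450, 500, 390, 700, 480, 560]
unchanged_gene = [110, 100, 130, 120, 95, 150, 105, 140, 125, 115]

print(select_gene(elevated_gene, control))   # passes both criteria
print(select_gene(unchanged_gene, control))  # fails the fold-change criterion
```

Requiring both criteria removes genes that are statistically different but only marginally elevated, as well as genes with a large but inconsistent elevation.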

[0087] The results of a first set of microarray analyses showed that 1,679
genes exhibited significantly different expression levels in saliva between
cancer patients and controls (P < 0.05). Seven cancer-related mRNA
biomarkers that exhibited at least a 3.5-fold elevation in OSCC saliva (P <
0.01) were consistently validated by qPCR on saliva samples from OSCC
patients (n = 32) and controls (n = 32). These salivary RNA biomarkers are
transcripts of IL8, IL1B, DUSP1, H3F3A, OAZ1, S100P, and SAT. The
combinations of these biomarkers yielded sensitivity (91%) and specificity
(91%) in distinguishing OSCC from the controls (see Examples 13-16).

[0088] The results of a second set of microarray analyses showed that five of
ten up-regulated genes, selected based on their reported cancer association,
had significantly elevated transcripts in the serum of OSCC patients. These
RNA biomarkers are transcripts of H3F3A, TPT1, FTH1, NCOA4 and ARCR.
The results validated by qPCR confirmed that transcripts of these five genes
were significantly elevated in the serum of OSCC patients (Wilcoxon Signed
Rank test, P < 0.05) (see Examples 9 to 12).

[0089] Using the described collection and processing protocols, the
presence of ACTB, B2M, GAPDH and RPS9 mRNAs (control mRNAs) was
confirmed in all serum samples (patients and controls) by RT-PCR.

[0090] Accordingly, a method for diagnosing a cancer, in particular OSCC, in
a subject is disclosed. The method comprises: providing a bodily fluid of the
subject; detecting in the bodily fluid a profile of a biomarker, the biomarker
selected from the group consisting of IL8, IL1B, DUSP1, H3F3A, OAZ1,
S100P, SAT, IL6, H3F3A, TPT1, FTH1, NCOA4 and ARCR; and comparing the
profile of the biomarker with a predetermined profile of the biomarker,
recognition in the profile of the biomarker of characteristics of the
predetermined profile of the biomarker being diagnostic for the cancer.

[0091] Also, a method to diagnose an oral and/or systemic pathology, disease
or disorder, in particular OSCC, is disclosed. The method comprises using
salivary mRNAs as biomarkers for oral and/or systemic diseases, in particular
salivary mRNAs selected from the group consisting of IL8, IL1B, DUSP1,
H3F3A, OAZ1, S100P and SAT.

[0092] Additionally, a method to diagnose an oral and/or systemic pathology,
disease or disorder, in particular OSCC, is disclosed. The method comprises
using serum mRNAs and/or protein as biomarkers for oral and/or systemic
diseases, in particular serum mRNAs selected from the group consisting of
IL6, H3F3A, TPT1, FTH1, NCOA4 and ARCR, and serum IL6 protein.

[0093] Given the multifactorial nature of oncogenesis and the heterogeneity
in oncogenic pathways, the use of combinations of salivary and/or serum
biomarkers, ensuring higher specificity and sensitivity in detecting the
disease, is preferred. The multiple statistical strategies and risk models
described in the examples can be used to identify combinations of biomarkers
that can identify OSCC patient samples and to facilitate assigning the
appropriate serum transcriptome-based diagnosis for a patient's specific
cancer risk.

[0094] Monitoring of the profile of salivary mRNA in the cell-free fluid phase
portion of saliva and/or in another bodily fluid such as blood serum can be
used in the postoperative management of OSCC patients. It could potentially be
used for monitoring the efficacy of treatment, or disease recurrence after
therapy has concluded. Salivary mRNAs, and in particular IL8, may also serve as
prognostic indicators to direct the treatment of patients with oral cavity
cancer. Prospectively, high-risk patients can be directed to more aggressive or
adjuvant treatment regimens.

[0095] The use of these biomarkers may also improve the staging of the
tumor. With traditional techniques, the presence of microscopic distant
disease is often under-recognized. In recent years, there has been a shift
from locoregional failure to distant failure for patients treated for presumed
locoregional disease [18]. This is in part a reflection of subclinical distant
disease present prior to the initiation of therapy. Testing for the presence of
biomarkers may allow the detection of small amounts of tumor cells in a
background of normal tissue. Salivary mRNAs as biomarkers specific for
head and neck tumors, or a panel of such biomarkers, may allow the detection
of distant microscopic disease. For oral cancer, one of the most important
applications of the STD approach in this respect is to detect the cancer
conversion of oral premalignant lesions.

[0096] Profiling of salivary mRNAs can also be used to investigate the role
of genes in the development of cancer, in particular whether the aberrant
expression of these genes functionally contributes to the development of
human OSCC. The biological significance of differential expression of these
genes in head and neck/oral cancer should be determined. Identification of
cancer-associated genes that are consistently changed in cancer patients will
provide not only diagnostic markers but also insights about the
molecular profiles involved in head and neck cancer development.
Understanding the profile of molecular changes in any particular cancer will
be extremely useful because it will become possible to correlate the resulting
phenotype of that cancer with molecular events.

[0097] Kits of parts associated with the methods herein disclosed are also
disclosed. In an exemplary embodiment, a kit comprises: an identifier of a
biomarker in a bodily fluid, such as a salivary mRNA or protein, and serum
mRNA or protein, the biomarker selected from the group consisting of IL8,
IL1B, DUSP1, H3F3A, OAZ1, S100P, SAT, IL6, H3F3A, TPT1, FTH1,
NCOA4 and ARCR; and a detector for the identifier, the identifier and the
detector to be used in detecting the bodily fluid profile of the biomarker of one
of the methods herein disclosed, wherein the identifier is associated with the
biomarker in the bodily fluid, and the detector is used to detect the identifier,
the identifier and the detector thereby enabling the detection of the bodily fluid
profile of the biomarker.

[0098] The bodily fluid can be saliva, with the detection performed in the 
cell-free fluid phase portion thereof, or another bodily fluid such as blood
serum.

[0099] The identifier, and the detector able to detect the identifier, are
identifiable by a person skilled in the art. Other compositions and/or
components that may be suitably included in the kit are also identifiable
by a person skilled in the art.

[00100] The identifier and the reagent can be included in one or more
compositions where the identifier and/or the reagent are included with a
suitable vehicle, carrier or auxiliary agent.

[00101] In the diagnostic kits herein disclosed, the agents and identifier
reagents can be provided in the kits, with suitable instructions and other
necessary reagents, in order to perform the methods herein disclosed. The kit
will normally contain the compositions in separate containers. Instructions, for
example written or audio instructions, on paper or electronic support such as
tapes or CD-ROMs, for carrying out the assay, will usually be included in the
kit. The kit can also contain, depending on the particular method used, other
packaged reagents and materials (e.g., wash buffers and the like).

[00102] Further details concerning the identification of the suitable carrier 
agent or auxiliary agent of the compositions, and generally manufacturing and 
packaging of the kit, can be identified by the person skilled in the art upon 
reading of the present disclosure.

[00103] The kit of parts herein disclosed can be used in particular for
diagnostic purposes. As a result, a non-invasive diagnostic detection of
pathologies, diseases or disorders, and in particular of oral cavity and
oropharyngeal cancer in patients, is disclosed.

[00104] The use of the fluid phase of saliva has unique advantages over the
use of exfoliated cells. Depending on the location of the tumor, one may not
be able to easily access and swab the tumor bed. Although salivary
biomarkers could not identify the site from which the tumor originated, they
could identify patients at risk. Such a saliva test could be administered by
nonspecialists in remote locations as a screening tool to select patients for
referral for careful evaluation of the upper aerodigestive tract. Finding early
stage, previously undetected disease may ultimately save lives. Moreover, the
use of easily accessible biomarkers may prove highly beneficial in large
populations or chemoprevention trials. This could be envisioned during routine
dental visits or targeted screening of individuals at high risk of development of
the disease. A home test kit can also be envisioned.

[00105] Also the use of a blood test is envisioned, in particular for early
cancer detection. Recovering the cell-free circulating mRNA or protein
biomarkers in the serum of cancer patients representing characteristics of
tumor genetic alteration, such as IL6 mRNA and protein, H3F3A mRNA, TPT1 mRNA,
FTH1 mRNA, NCOA4 mRNA and ARCR mRNA diagnostic for OSCC, could
be envisioned as a screening test for the presence of occult OSCC during a
routine physician's visit with blood work or targeted screening of individuals
at high risk for oral cancer development. A home test kit can also be envisioned,
including preferably

[00106] In particular, peripheral blood can be obtained from subjects using
routine clinical procedures, and mRNA and proteins can be isolated,
preferably with the optimized procedures herein disclosed. Real time
quantitative PCR and ELISA for the respective cytokine will be performed for
one or more biomarkers, such as IL6.

[00107] Prospective embodiments of the methods herein disclosed are
directed towards the eventual creation of micro-/nano-electrical mechanical
systems (MEMS/NEMS) for the ultrasensitive detection of molecular
biomarkers in oral fluid. RNA and protein expression for the validated OSCC
biomarkers will be selected as targets for cancer detection. The integration of
these detection systems for the concurrent detection of mRNA and protein for
multiple OSCC biomarkers will result in an efficient, automated, affordable
system for oral fluid based cancer diagnostics.

[00108] Further details concerning reagents, conditions, compositions and
techniques to be used in the method and kits of the disclosure are identifiable
by a person skilled in the art upon reading of the present disclosure.

[00109] Also, appropriate modifications of the STD methods and kits herein
disclosed and exemplified as associated with OSCC and/or HNSCC, for the
mRNA profiling and transcriptome analysis associated with investigation and
diagnosis of other pathologies, diseases and disorders, can be made by a
person skilled in the art upon reading of the present disclosure.

[00110] The following examples are provided to describe the invention in
further detail. These examples, which set forth a preferred mode presently
contemplated for carrying out the invention, are intended to illustrate and not
to limit the invention.

Examples

Example 1: RNA isolation, amplification and gene expression profiling from
cell-free saliva of normal donors

Normal subjects 

[00111] Saliva samples were obtained from ten normal donors from the
Division of Otolaryngology, Head and Neck Surgery, at the Medical Center,
University of California, Los Angeles (UCLA), CA, in accordance with a
protocol approved by the UCLA Institutional Review Board. The following
inclusion criteria were used: age ≥ 30 years; no history of malignancy,
immunodeficiency, autoimmune disorders, hepatitis, HIV infection or smoking.
The study population was composed of 6 males and 4 females, with an
average age of 42 years (range from 32 to 55 years).

Saliva collection and processing to obtain the relevant fluid phase 

[00112] Unstimulated saliva samples were collected between 9 am and 10 am in
accordance with published protocols [38]. Subjects were asked to refrain from
eating, drinking, smoking or oral hygiene procedures for at least one hour
prior to saliva collection. Saliva samples were centrifuged at 2,600 x g for 15
min at 4 °C. Saliva supernatant was separated from the cellular phase. RNase
inhibitor (Superase-In, Ambion Inc., Austin, TX, USA) and protease inhibitor
(Aprotinin, Sigma, St. Louis, MO, USA) were then added to the cell-free
saliva supernatant.

RNA isolation from cell-free saliva 

[00113] RNA was isolated from cell-free saliva supernatant using the 
modified protocol from the manufacturer (QIAamp Viral RNA kit, Qiagen, 






Valencia, CA, USA). Saliva (560 µL), mixed well with AVL buffer (2,240 µL),
was incubated at room temperature for 10 min. Absolute ethanol (2,240 µL)
was added and the solution passed through silica columns by centrifugation at
6,000 x g for 1 min. The columns were then washed twice, centrifuged at
20,000 x g for 2 min, and eluted with 30 µL RNase-free water at 9,000 x g for
2 min. Aliquots of RNA were treated with RNase-free DNase (DNase I-DNA-free,
Ambion Inc., Austin, TX, USA) according to the manufacturer's
instructions.

[00114] The stability of the isolated RNA was examined by RT-PCR typing for
actin-β (ACTB) after storage for 1, 3, and 6 months. The results reported in
Figure 1A show that the mRNA isolated could be preserved without significant
degradation for more than 6 months at -80 °C.

[00115] The quality of isolated RNA was examined by RT-PCR for three
housekeeping gene transcripts: glyceraldehyde-3-phosphate dehydrogenase
(GAPDH), actin-β (ACTB) and ribosomal protein S9 (RPS9). Primers were
designed using PRIMER3 software (http://www.genome.wi.mit.edu) and were
synthesized commercially (Fisher Scientific, Tustin, CA, USA) as follows: the
primers having the sequences reported in the attached sequence listing as SEQ
ID NO: 1 and SEQ ID NO: 2 for GAPDH; the primers having the sequences
reported in the attached sequence listing as SEQ ID NO: 3 and SEQ ID NO: 4 for
ACTB; the primers having the sequences reported in the attached sequence
listing as SEQ ID NO: 5 and SEQ ID NO: 6 for RPS9. The quantity of RNA was
estimated using the Ribogreen® RNA Quantitation Kit (Molecular Probes, Eugene,
OR, USA). The results are shown in Figure 1B, wherein GAPDH (B1), RPS9
(B2) and ACTB (B3) were detected consistently in all 10 cases tested,
demonstrating that all 10 saliva samples contain mRNAs that encode the
housekeeping genes GAPDH, ACTB and RPS9.

[00116] The mRNA of these genes could be preserved without significant
degradation for more than 6 months at -80 °C (see results for ACTB reported
in Fig. 1A).

Target cRNA preparation

[00117] Isolated RNA was then subjected to linear amplification according to
a published method from our laboratory (Ohyama et al., 2000). In brief, reverse
transcription using T7-oligo-(dT)24 as the primer was performed to synthesize
the first strand cDNA. The first round of in vitro transcription (IVT) was carried
out using T7 RNA polymerase (Ambion Inc., Austin, TX, USA). The
BioArray™ High Yield RNA Transcript Labeling System (Enzo Life Sciences,
Farmingdale, NY, USA) was used for the second round IVT to biotinylate the
cRNA product; the labeled cRNA was purified using the GeneChip® Sample
Cleanup Module (Affymetrix, Santa Clara, CA, USA).

[00118] The quantity and quality of cRNA were determined by 
spectrophotometry and gel electrophoresis. Exemplary results of agarose gel 
electrophoresis test reported on Figure 2A show different quantities of 
amplified cRNA at the different stages of the RNA amplification. 

[00119] Also, small aliquots from each of the isolation and amplification steps
were used to assess the quality by RT-PCR. Exemplary results reported in
Figure 2B show PCR typing of ACTB performed at the various stages of RNA
amplification, wherein the expected single band (153 bp) can be detected in
every main step of the salivary RNA amplification process.

[00120] The quality of the fragmented cRNA (prepared as described by Kelly,
2002) was also assessed by capillary electrophoresis using the 2100
Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA). Exemplary results
reported in Figure 2C show one single peak in a narrow range (50-200 bp),
demonstrating proper fragmentation.

Gene expression profiling in the targeted cRNA preparation

[00121] Gene expression profiling was performed in cell-free saliva obtained
from ten normal donors, wherein on average, 60.5 ± 13.1 ng (n=10) of total
RNA was obtained from 560 µL cell-free saliva samples. The results are
reported in Table 1.

Table 1.

Subject   Gender   Age      RNA (ng)(a)   cRNA (µg)(b)   Present probes(c)   P% (d)
1         F        53       60.4          44.3           3172                14.24
2         M        42       51.6          40.8           2591                11.62
3         M        55       43.2          34.8           2385                10.70
4         M        42       48.2          38.0           2701                12.12
5         M        46       60.6          42.7           3644                16.35
6         M        48       64.8          41.8           2972                13.34
7         F        40       75.0          44.3           2815                12.63
8         M        33       77.8          49.3           4159                18.66
9         F        32       48.8          41.4           2711                12.17
10        F        32       79.8          44.4           4282                19.22
Mean±SD            42±8.3   60.5±13.12    42.2±3.94      3143±665.0          14.11±2.98


[00122] (a) The total RNA quantity is the RNA in 560 µL of cell-free saliva
supernatant. (b) The cRNA quantity is after two rounds of T7 amplification.
(c) Number of probes showing a present call on the HG U133A microarray
(detection p < 0.04). (d) Present percentage (P%) = number of probes assigned
a present call / number of total probes (22,283 for the HG U133A microarray).
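
By way of illustration only, the present percentage defined above can be computed from the present-call counts listed in Table 1 as follows. The use of the sample standard deviation (n - 1 denominator) is an assumption of the sketch; with it, the result comes out close to the mean and standard deviation reported for P%.

```python
import statistics

TOTAL_PROBES = 22283  # probe count stated for the HG U133A microarray

# Present-call counts for subjects 1-10, copied from Table 1.
present_probes = [3172, 2591, 2385, 2701, 3644, 2972, 2815, 4159, 2711, 4282]


def present_percentage(present, total=TOTAL_PROBES):
    """P% = probes assigned a present call / total probes, as a percentage."""
    return 100.0 * present / total


percentages = [present_percentage(count) for count in present_probes]
mean_pct = statistics.mean(percentages)
sd_pct = statistics.stdev(percentages)  # sample SD (n - 1), an assumption
print(f"P% = {mean_pct:.2f} +/- {sd_pct:.2f}")
```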

[00123] After two rounds of T7 RNA linear amplification, the average yield of
biotinylated cRNA was 42.2 ± 3.9 µg with A260/280 = 2.067 ± 0.082 (Table 1).
The cRNA ranged from 200 bp to 4 kb before fragmentation, and was
concentrated to approximately 100 bp after fragmentation. The quality of the
cRNA probe was confirmed by capillary electrophoresis before the hybridizations.
ACTB mRNA was detectable using PCR/RT-PCR on the original sample and on
products from each amplification step: first cDNA, first in vitro transcription
(IVT), second cDNA and second IVT, with a resulting agarose electrophoresis
pattern comparable to the one shown in Fig. 2B.

Example 2: Microarray profiling of mRNA from cell-free saliva
of normal donors

[00124] Saliva was collected and processed, and the RNA was isolated, as
reported in Example 1. The stability, quality and quantity of the RNA were
also assessed as reported in Example 1.

HG-U133A microarray analysis

[00125] The Affymetrix Human Genome U133A Array, which contains 22,215
human gene cDNA probe sets representing ~19,000 genes (i.e., each gene
may be represented by more than one probe set), was applied for gene
expression profiling. The array data were normalized and analyzed using
Microarray Suite (MAS) software (Affymetrix). A detection p-value was
obtained for each probe set. Any probe set with p < 0.04 was assigned
"present", indicating that the matching gene transcript is reliably detected
(Affymetrix, 2001). The total number of present probe sets on each array was
obtained and the present percentage (P%) of present genes was calculated.
Functional classification was performed on selected genes (present on all ten
arrays, p < 0.01) by using the Gene Ontology Mining Tool (www.netaffx.com).

[00126] Salivary mRNA profiles of ten normal subjects were obtained using the
HG U133A array, which contains 22,283 cDNA probes. An average of 3,143 ± 665.0
probe sets (p < 0.04) was assigned present calls on each array (n=10).
These probe sets represent approximately 3,000 different mRNAs. The
average present call percentage was 14.11 ± 2.98% (n=10). A reference
database which includes data from the ten arrays was generated. The probe
sets representing GAPDH, ACTB and RPS9 were assigned present calls on all 10
arrays. In total, 207 probe sets representing 185 genes were assigned
present calls on all 10 arrays with detection p < 0.01. These 185 genes were
categorized on the basis of their known roles in biological processes and
molecular functions. Biological processes and molecular functions of the 185
genes in cell-free saliva from ten normal donors (data obtained by using Gene
Ontology Mining Tool) are reported in Table 2.

Table 2.

Biological process (a)              Genes, n (b)
Cell growth and/or maintenance      119
Metabolism                          [illegible]
Biosynthesis                        70
Protein metabolism                  [illegible]
Nucleotide metabolism               [illegible]
Cell organization and biogenesis    9
Homeostasis                         [illegible]
Cell cycle                          [illegible]
Cell proliferation                  11
Cell communication                  [illegible]
Response to external stimulus       [illegible]
Cell adhesion                       3
Cell-cell signaling                 5
Signal transduction                 17
Obsolete                            8
Development                         18
Death                               2
Biological process unknown          11

Molecular function (a)              Genes, n (b)
Binding                             118
Nucleic acid binding                [illegible]
RNA binding                         73
Calcium ion binding                 12
[illegible]                         [illegible]
Structural molecule                 95
Ribosomal constituent               71
Cytoskeleton constituent            17
Transporter                         [illegible]
Enzyme                              20
Signal transducer                   10
Transcription regulator             7
Translation regulator               5
Enzyme regulator                    9
Cell adhesion molecule              1
Molecular function unknown          6

[00127] (a) One gene may have multiple molecular functions or participate in
different biological processes. (b) Number of genes classified into a certain
group/subgroup. The major functions of the 185 genes are related to cell
growth/maintenance (119 genes), molecular binding (118 genes) and cellular
structure composition (95 genes). We termed these 185 genes the "Normal
Salivary Core Transcriptome (NSCT)".

Example 3: Q-PCR validation and quantitation analysis of
microarray profiling from cell-free saliva of normal donors

[00128] The microarray analysis performed in Example 2 was validated
through a quantitative gene expression analysis by Q-PCR.

Quantitative gene expression analysis by Q-PCR

[00129] Real time quantitative PCR (Q-PCR) was used to validate the
presence of human mRNA in saliva by quantifying selected genes from the
185 "Normal Salivary Core Transcriptome" genes detected by the microarray
profiling reported in Example 2. Genes IL1B, SFN and K-ALPHA-1, which
were assigned present calls on all 10 arrays, were randomly selected for
validation.

[00130] Q-PCR was performed using the iCycler™ thermal cycler (Bio-Rad,
Hercules, CA, USA). A 2 µL aliquot of the isolated salivary RNA (without
amplification) was reverse transcribed into cDNA using MuLV Reverse
Transcriptase (Applied Biosystems, Foster City, CA, USA). The resulting
cDNA (3 µL) was used for PCR amplification using iQ SYBR Green Supermix
(Bio-Rad, Hercules, CA, USA). The primers were synthesized by Sigma-Genosys
(Woodlands, TX, USA) as follows: the primers having the sequences
reported in the attached sequence listing as SEQ ID NO: 7 and SEQ ID NO: 8 for
interleukin 1, beta (IL1B); the primers having the sequences reported in the
attached sequence listing as SEQ ID NO: 9 and SEQ ID NO: 10 for stratifin
(SFN); the primers having the sequences reported in the attached sequence
listing as SEQ ID NO: 11 and SEQ ID NO: 12 for tubulin, alpha, ubiquitous
(K-ALPHA-1). All reactions were performed in triplicate with conditions
customized for the specific PCR products. The initial amount of cDNA of a
particular template was extrapolated from a standard curve using the
LightCycler software 3.0 (Bio-Rad, Hercules, CA, USA). The detailed procedure
for quantification by standard curve has been previously described
(Ginzinger, 2002).
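
By way of illustration only, quantification by standard curve can be sketched as follows. The dilution series and threshold-cycle (Ct) values are hypothetical, and fitting a least-squares line of Ct against log10 of the template copy number is the conventional approach assumed here; it is not presented as the calculation performed by the software named above.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares line: Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold dilution series of a template of known copy number.
standard_copies = [1e3, 1e4, 1e5, 1e6]
standard_ct = [30.0, 26.7, 23.4, 20.1]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
unknown_copies = copies_from_ct(25.05, slope, intercept)
print(f"slope={slope:.2f}, estimated copies={unknown_copies:.0f}")
```

A slope near -3.3 cycles per ten-fold dilution corresponds to roughly a doubling of product per cycle, which is why the slope is commonly inspected before unknown samples are read off the curve.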

[00131] Q-PCR results showed that mRNA of IL1B, SFN and K-ALPHA-1
was detectable in all 10 original, unamplified, cell-free saliva samples. The
relative amounts (in copy number) of these transcripts (n=10) are: 8.68 x 10^3
± 4.15 x 10^3 for IL1B; 1.29 x 10^5 ± 1.08 x 10^5 for SFN; and 4.71 x 10^6 ±
8.37 x 10^5 for K-ALPHA-1. The relative RNA expression levels of these genes
measured by Q-PCR were similar to those measured by the microarrays (data not
shown).

Example 4: IL6 and IL8 mRNA isolation, amplification and analysis
of the expression in cell-free saliva of OSCC patients

Patients selection 

[00132] Patients were recruited from the Division of Head and Neck Surgery
at the University of California, Los Angeles (UCLA) Medical Center, Los
Angeles, CA; the University of Southern California (USC) Medical Center, Los
Angeles, CA; and the University of California San Francisco (UCSF) Medical
Center, San Francisco, CA, over a 6-month period.

[00133] Thirty-two patients with documented primary T1 or T2 squamous cell
carcinoma of the oral cavity (OC) or oropharynx (OP) were included in this
study. All patients had recently been diagnosed with primary disease, and
had not received any prior treatment in the form of chemotherapy,
radiotherapy, surgery, or alternative remedies. An equal number of age- and
sex-matched subjects with comparable smoking histories were selected as a
control comparison group.

[00134] Among the two subject groups, there were no significant differences
in terms of mean age (standard deviation, SD): OSCC patients, 49.3 (7.5)
years; normal subjects, 48.8 (5.7) years (Student's t test P > 0.80); gender
(Student's t test P > 0.90); or smoking history (Student's t test P > 0.75). No
subjects had a history of prior malignancy, immunodeficiency, autoimmune
disorders, hepatitis, or HIV infection. Each of the individuals in the control
group underwent a physical examination by a head and neck surgeon, to
ensure that no suspicious mucosal lesion was present.

Saliva Collection And Processing 

[00135] Informed consent had been given by all patients. Saliva and serum
procurement procedures were approved by the institutional review board at
each institution: the University of California, Los Angeles (UCLA); the
University of Southern California (USC); and the University of California San
Francisco (UCSF).

[00136] Saliva from 32 patients with OC or OP SCCA, and 32 unaffected
age- and gender-matched control subjects, was obtained for a prospective
comparison of cytokine concentration.

[00137] The subjects were required to abstain from eating, drinking, smoking,
or using oral hygiene products for at least one hour prior to saliva collection.
Saliva collection was performed using the "draining (drooling)" method of
Navazesh and Christensen [7], for a total donation of 5 cc saliva. Saliva
samples were subjected to centrifugation at 3500 rpm (2600 x g) for 15 minutes
at 4°C by a Sorvall RT6000D centrifuge (DuPont, Wilmington, DE). The fluid
phase was then removed, and RNAse (Superase-In, RNAse Inhibitor, Ambion
Inc., Austin, TX) and protease (Aprotinin, Sigma, St. Louis, MO;
Phenylmethylsulfonylfluoride, Sigma, St. Louis, MO; Sodium Orthovanadate,
Sigma, St. Louis, MO) inhibitors were then added promptly on ice. The
conditions for the separation of the cellular and fluid phases of saliva were
optimized to ensure no mechanical rupture of cellular elements which would
contribute to the mRNA detected in the fluid phase. All samples were
subsequently treated with DNAse (DNAse I-DNA-free, Ambion Inc., Austin,
TX). The cell pellet was retained and stored at -80°C.
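
By way of illustration only, the relation between rotor speed (rpm) and relative centrifugal force (x g) can be sketched with the standard conversion RCF = 1.118 x 10^-5 x r x rpm^2, with r the rotor radius in centimeters. The rotor radius is not stated in the disclosure; the 19 cm value used below is a hypothetical figure chosen because it makes 3500 rpm correspond to approximately 2600 x g.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) for a given rotor speed."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed needed to reach a target relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


ASSUMED_RADIUS_CM = 19.0  # hypothetical rotor radius, not from the disclosure

print(round(rcf_from_rpm(3500, ASSUMED_RADIUS_CM)))   # force at 3500 rpm
print(round(rpm_from_rcf(2600, ASSUMED_RADIUS_CM)))   # speed for 2600 x g
```

Because the force depends on the rotor radius, reporting the speed in x g rather than in rpm alone allows the separation conditions to be reproduced on a different centrifuge.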

RNA isolation from cell-free saliva

[00138] 560 µL of saliva supernatant were then processed using the QIAamp
Viral RNA mini kit (QIAGEN, Chatsworth, CA). RNA was extracted
according to the manufacturer's instructions. Samples were air-dried and
resuspended in water treated with diethyl pyrocarbonate and were kept on ice
for immediate usage or stored at -80°C. Aliquots of RNA were treated with
RNAse-free DNAse (DNAse I-DNA-free, Ambion Inc., Austin, TX) according to
the manufacturer's instructions. Concentrations of RNA were determined
spectrophotometrically, and the integrity was checked by electrophoresis in
agarose gels containing formaldehyde.

[00139] Presence of IL6 and IL8 mRNA transcripts in the fluid phase of saliva
was tested by using reverse transcriptase-polymerase chain reaction (RT-PCR).

[00140] RNA from each sample was reverse-transcribed in 40 µL of reaction
mixture containing 2.5 U of Moloney murine leukemia virus reverse
transcriptase (Applied Biosystems Inc. (ABI), Foster City, CA) and 50 pmol of
random hexanucleotides (ABI, Foster City, CA) at 42°C for 45 minutes.
Based on the published sequences, oligonucleotide primers were synthesized
commercially at Fisher Scientific (Tustin, CA) for PCR as follows: the primers
having the sequences reported in the attached sequence listing as SEQ ID NO: 13
and SEQ ID NO: 14 for β-actin; the primers having the sequences reported in the
attached sequence listing as SEQ ID NO: 15 and SEQ ID NO: 16 for IL8; and
the primers having the sequences reported in the attached sequence listing as
SEQ ID NO: 17 and SEQ ID NO: 18 for IL6.

[00141] Amplification of the complementary DNA (cDNA) was carried out
using 50 cycles at 95 °C for 20 seconds, 60 °C for 30 seconds, and 72 °C for
30 seconds, followed by a final extension cycle of 72 °C for 7 minutes.
Specificity of the PCR products was verified by the predicted size and by
restriction digestion. To establish the specificity of the responses, negative
controls were used in which input RNA was omitted or in which RNA was
used but reverse transcriptase omitted. As a positive control, mRNA was
extracted from total salivary gland RNA (Human Salivary Gland Total RNA,
Clontech, Palo Alto, CA). To ensure RNA quality, all preparations were
subjected to analysis of expression.

[00142] The RT-PCR studies so performed showed that saliva and serum
contained mRNA encoding IL6 and IL8. Exemplary results reported in
Figure 3 show PCR products of the sizes (95 bp for IL6 and 88 bp for IL8)
that were expected from the selected primers. The same-sized products were
expressed in the positive control.

[00143] In order to ensure that the RNA and protein analyzed were from the
fluid phase of saliva only and to ensure the lack of contamination by
intracellular components, the centrifugation speed for the saliva samples was
optimized. PCR for the housekeeping genes β-actin and ubiquitin was performed
on whole saliva samples and on samples that had been centrifuged at various
speeds, using DNA as a marker of cell lysis and spillage of intracellular
components. The results support an optimal centrifugation speed for saliva
samples of 2,600 ± 52 x g, with a preferred speed of 2,600 x g (see exemplary
results reported in Figure 4).

Example 5: IL6 and IL8 mRNA Isolation, Amplification and Analysis
of the Expression in Serum of OSCC Patients

[00144] Patients recruited as reported in Example 4 were subjected to
analysis of the presence of IL6 and IL8 mRNA in blood serum.

Serum collection and processing 

[00145] Serum from 19 patients with OC or OP SCCA and from 32 unaffected
age- and gender-matched control subjects was obtained for a prospective
comparison of cytokine concentration. Among the subject groups, there were
no significant differences in terms of age, gender, alcohol consumption, or
smoking history (P > 0.75).

[00146] Blood was drawn from control subjects and patients prior to
treatment. Sera were collected by centrifuging whole blood at 3000 rpm
(1,000 x g) for 10 minutes at 15°C in a Sorvall RT6000D centrifuge (DuPont,
Wilmington, DE). Serum was then separated, and RNase inhibitor (Superase-In,
RNase Inhibitor, Ambion Inc., Austin, TX) and protease inhibitors (Aprotinin,
Sigma, St. Louis, MO; Phenylmethylsulfonylfluoride, Sigma, St. Louis, MO;
Sodium Orthovanadate, Sigma, St. Louis, MO) were then added promptly on
ice. All samples were subsequently treated with DNase (DNaseI-DNA-free,
Ambion Inc., Austin, TX). The aliquots were stored at -80°C until further use.

Reverse Transcriptase-Polymerase Chain Reaction

[00147] Presence of IL6 and IL8 mRNA transcripts in the serum was tested 
by using reverse transcriptase-polymerase chain reaction (RT-PCR) 
performed as described in Example 4 above. 

[00148] The RT-PCR studies so performed showed that serum contained
mRNA encoding IL6 and IL8, with an electrophoresis gel pattern comparable
to the one shown in Figure 3.

[00149] In order to ensure that the RNA and protein analyzed were from the
fluid phase of serum only and to ensure the lack of contamination by
intracellular components, the centrifugation speed for the serum samples was
optimized following the same approach described in Example 4 for saliva
samples. The results support an optimal centrifugation speed for serum
samples of 1,000 ± 20 x g, with a preferred speed of 1,000 x g.
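The serum centrifugation in this Example is quoted both as a rotor speed (3000 rpm) and as a relative centrifugal force (1,000 x g). The following is a minimal sketch of the standard conversion between the two, RCF = 1.118e-5 x radius (cm) x rpm^2. The rotor radius is not given in the text, so the value used here is a hypothetical assumption chosen only for illustration, and the function names are likewise illustrative.

```python
import math

# Standard relation between relative centrifugal force (x g) and rotor speed:
#   RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5

# Hypothetical rotor radius in cm; the text does not state it.
ASSUMED_RADIUS_CM = 10.0


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given force at a given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def within_window(rcf: float, target: float, tolerance: float) -> bool:
    """True if the force lies inside target +/- tolerance (e.g. 1,000 +/- 20 x g)."""
    return abs(rcf - target) <= tolerance


if __name__ == "__main__":
    force = rcf_from_rpm(3000, ASSUMED_RADIUS_CM)
    print(round(force, 1), within_window(force, 1000, 20))
```

With a different rotor the same rpm gives a different force, which is why protocols are usually specified in x g rather than rpm.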

EXAMPLE 6: IL6 AND IL8 CYTOKINE LEVELS ANALYSIS IN SALIVA FROM
OSCC PATIENTS



[00150] On demonstrating that IL6 and IL8 mRNA transcripts were present in 
the fluid phase in saliva, we prospectively examined and compared the levels 
of IL6 and IL8 in the saliva of unaffected subjects and patients with OSCC 
using quantitative real time PCR (qRT-PCR) and ELISA. 

[00151] Saliva from 32 patients with OSCC and from 32 age- and gender-matched
control subjects was obtained. Among the subject groups, there
were no significant differences in terms of age, gender, alcohol consumption,
or smoking history (P > 0.75).

Real Time PCR for Quantification of IL6 and IL8 mRNA Concentrations in
Saliva from Patients and Normal Subjects

[00152] To analyze quantitatively the result of RT-PCR, quantitative real-time
PCR (Bio-Rad iCycler, Thermal Cycler, Bio-Rad Laboratories, Hercules, CA)
was used. Each sample was tested in triplicate. The amplification reactions
were carried out in a 20 µL mixture, using iQ SYBR Green Supermix (Bio-Rad
Laboratories, Hercules, CA). After initial denaturation at 95°C for 3 minutes,
50 PCR cycles were performed at 60°C for 20 seconds, then 20 seconds at
72°C, then 20 seconds at 83°C, followed by 1 minute at 95°C, then followed
by a final 1 minute extension at 55°C. Aliquots were taken from each well and
checked by electrophoresis in agarose gels in order to ensure the specificity
of the products.

[00153] The RT-PCR results are illustrated by the diagram shown in Figure
5A. Such results show that IL8, at both the mRNA and protein levels, was
detected in higher concentrations in the saliva of patients with OSCC when
compared with control subjects (t test, P < 0.01). There was a significant
difference in the amount of IL8 mRNA expression between saliva from OSCC
patients and disease-free controls. The mean copy number was 1.1 x 10^3 for
the OSCC group and 2.6 x 10^1 for the control group. The difference between
the two groups was highly statistically significant (P < 0.0008).
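To put the two mean copy numbers reported above on a common scale, the fold difference and its base-10 logarithm can be computed directly. This is only an arithmetic sketch using the two means quoted in the paragraph; the function and variable names are illustrative and not part of the original method.

```python
import math


def fold_difference(mean_case: float, mean_control: float) -> float:
    """Ratio of the case-group mean to the control-group mean."""
    if mean_control <= 0:
        raise ValueError("control mean must be positive")
    return mean_case / mean_control


# Mean IL8 mRNA copy numbers in saliva, as quoted in the text.
IL8_SALIVA_OSCC = 1.1e3
IL8_SALIVA_CONTROL = 2.6e1

fold = fold_difference(IL8_SALIVA_OSCC, IL8_SALIVA_CONTROL)
log10_fold = math.log10(fold)

if __name__ == "__main__":
    print(f"fold difference: {fold:.1f}, log10: {log10_fold:.2f}")
```

A ratio of means says nothing about the spread within each group, so it complements rather than replaces the t test reported in the text.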

[00154] No significant differences were instead found in the salivary
concentration of IL6 at the mRNA level. Within the sample size studied, the
inventors were also unable to detect differences between smoking and
nonsmoking subjects.

ELISA for Quantification of IL6 and IL8 Protein Concentrations in Saliva from
Patients and Normal Subjects

[00155] ELISA kits for IL6 and IL8 were used (Pierce Endogen, Rockford, IL)
according to the manufacturer's protocol. Each sample was tested in
duplicate in each of two replicate experiments. After development of the
colorimetric reaction, the absorbance at 450 nm was quantitated by an
eight-channel spectrophotometer (EL800 Universal Microplate Reader, BIO-TEK
Instruments Inc., Winooski, VT), and the absorbance readings were converted
to pg/mL based upon standard curves obtained with recombinant cytokine in
each assay. If the absorbance readings exceeded the linear range of the
standard curves, the ELISA assay was repeated after serial dilution of the
supernatants. Each sample was tested in at least two ELISA experiments,
and the data were calculated from the mean of tests for each sample.
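The conversion of absorbance readings to concentrations via a standard curve can be sketched as follows. This is a simplified illustration that assumes piecewise-linear interpolation between standards; the kit's actual curve-fitting procedure is not described in the text, and the standard values below are hypothetical placeholders, not measured data. Readings outside the range of the standards return None, mirroring the instruction to repeat the assay after dilution.

```python
from statistics import mean
from typing import List, Optional, Sequence, Tuple

# Hypothetical standard curve: (absorbance at 450 nm, concentration).
# Absorbances are assumed to be strictly increasing.
EXAMPLE_STANDARDS: List[Tuple[float, float]] = [
    (0.05, 0.0), (0.25, 125.0), (0.45, 250.0), (0.85, 500.0), (1.65, 1000.0),
]


def absorbance_to_conc(a450: float,
                       standards: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Interpolate a concentration; None if the reading is off the curve."""
    pts = sorted(standards)
    if a450 < pts[0][0] or a450 > pts[-1][0]:
        return None  # outside the standards: dilute and repeat
    for (a0, c0), (a1, c1) in zip(pts, pts[1:]):
        if a0 <= a450 <= a1:
            return c0 + (c1 - c0) * (a450 - a0) / (a1 - a0)
    return None


def sample_concentration(readings: Sequence[float],
                         standards: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Mean concentration over replicate readings; None if any is off the curve."""
    concs = [absorbance_to_conc(a, standards) for a in readings]
    if any(c is None for c in concs):
        return None
    return mean(concs)
```

For example, `sample_concentration([0.25, 0.45], EXAMPLE_STANDARDS)` averages the two interpolated values for a duplicate measurement.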

[00156] The ELISA findings are illustrated by the diagram shown in Figure
5B. The levels of IL8 in the saliva of OSCC patients were significantly higher
(720 pg/dL) than those in the saliva of the control group (250 pg/dL)
(P < 0.0001). To ensure that the elevated levels of IL8 protein in saliva were
not due to an elevation of total protein levels in the saliva of OSCC patients,
we compared the total protein concentrations in saliva between the two groups.
No significant differences were found (P > 0.05).

[00157] No significant differences were found in the salivary concentration of
IL6 at the protein level. Also in the ELISA analysis, no differences were
detected within the sample size studied between smoking and nonsmoking
subjects.

Example 7: IL6 and IL8 Cytokine Levels Analysis in Serum from 
OSCC Patients 

[00158] We also examined and compared the levels of IL6 and IL8 in the
serum of unaffected subjects and patients with OSCC using qRT-PCR and
ELISA. The patients were selected as described in Example 4 and the serum
was processed as described in Example 5.

Real Time PCR for Quantification of IL6 and IL8 mRNA Concentrations in 
Serum from Patients and Normal Subjects 

[00159] To analyze quantitatively the result of RT-PCR, quantitative real-time
PCR was performed as described in Example 6.

[00160] The RT-PCR results are illustrated by the diagram shown in Figure
6A. Such results show that IL6 at the mRNA level was detected in higher
concentrations in the serum of patients with OSCC when compared with
control subjects (t test, P < 0.001). We noted a significant difference in the
amount of IL6 mRNA expression between serum from OSCC patients and
disease-free controls. The mean copy number was 5.2 x 10^4 for the OSCC
group and 3.3 x 10^3 for the control group. The difference between the two
groups was highly statistically significant (P < 0.0004).

[00161] No significant differences were instead found in the serum
concentration of IL8 at the mRNA level. Within the sample size studied, the
inventors were also unable to detect differences between smoking and
nonsmoking subjects.

ELISA for Quantification of IL6 and IL8 Protein Concentrations in Serum from
Patients and Normal Subjects

[00162] ELISA tests for quantification of IL6 and IL8 protein concentrations in 
serum were performed as described in Example 6. 

[00163] The relevant ELISA findings are illustrated by the diagram shown in
Figure 6B. The mean levels of IL6 in the serum of OSCC patients were
significantly higher (87 pg/dL) than those in the serum of the control group (0
pg/dL) (P < 0.0001).

[00164] No significant differences were found in the serum concentration of
IL8 at the protein level. Also in the ELISA analysis, no differences were
detected within the sample size studied between smoking and nonsmoking
subjects.

Example 8: ROC and Sensitivity/Specificity Analysis

[00165] Statistical analysis of the data collected from the
experiments reported in Examples 1 to 7 above demonstrates the specificity
and sensitivity of these biomarkers for HNSCC, and their predictive value.

Statistical Analysis 

[00166] The distributions of patient demographics were calculated overall and
separately for OSCC cases and controls, and were compared between the
two arms with either the Student's t-test for continuous measures or two-by-two
Chi-square tables for categorical measures. The distributions of IL6 and
IL8 levels in saliva and serum were computed and compared between the
OSCC cases and controls using two independent group t-tests. Differences
were considered significant for P values less than 0.01. Due to the range of
the IL6 and IL8 levels, log transformations of these measures were also used
in the analyses. Data were expressed as the mean ± SD. Age, gender, and
smoking history were controlled at the group level in the experimental design;
these patient factors were also adjusted for in the analyses when comparing IL6
and IL8 through regression modeling.
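The two-group comparison on log-transformed values described above can be sketched as follows. This is an illustrative implementation of the pooled-variance (Student's) two-sample t statistic using only the standard library; it returns the statistic and its degrees of freedom, and leaves the P value lookup to a statistics package. The function names are assumptions made for this sketch, and no study data are included.

```python
import math
from statistics import mean, variance
from typing import List, Sequence, Tuple


def log_transform(values: Sequence[float]) -> List[float]:
    """Base-10 log transform for measures spanning several orders of magnitude."""
    return [math.log10(v) for v in values]


def student_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / std_err, df
```

In use, each group's raw levels would first be passed through `log_transform` and the two resulting lists given to `student_t`.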

[00167] Using the binary outcome of the disease (OSCC cases) and non-disease
(controls) as the dependent variable, logistic regression models were
fitted to estimate the probability of developing OSCC as a function of each of
the potential biomarkers (IL6 or IL8), controlling for patient age, gender, and
smoking history. Using the fitted logistic models, receiver operating
characteristic (ROC) curve analyses were conducted to evaluate the
predictive power of each of the biomarkers [8][9][10]. Through the ROC
analyses, we calculated sensitivities and specificities by varying the criterion
of positivity from the least stringent (cut at probability of 0) to the most
stringent (cut at probability of 1). The optimal sensitivity and specificity were
determined for each of the biomarkers, and the corresponding cutoff/threshold
value of each of the biomarkers was identified. The biomarker that has the
largest area under the ROC curve was identified as having the strongest
predictive power for detecting OSCC.
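The sweep over positivity criteria described above can be sketched as follows. Given fitted probabilities and true labels, the code computes sensitivity and specificity at each candidate cut and picks one cut. The text does not say how the "optimal" pair was chosen, so the use of the Youden index (sensitivity + specificity - 1) here is an assumption, and the toy scores and labels are hypothetical, not study data.

```python
from typing import Sequence, Tuple


def sens_spec(scores: Sequence[float], labels: Sequence[int],
              cut: float) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cut is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cut)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cut)
    return tp / (tp + fn), tn / (tn + fp)


def best_cut(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Cut maximizing the Youden index (an assumed optimality criterion)."""
    cuts = sorted(set(scores))
    return max(cuts, key=lambda c: sum(sens_spec(scores, labels, c)) - 1)


# Hypothetical fitted probabilities and disease labels (1 = case, 0 = control).
TOY_SCORES = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
TOY_LABELS = [0, 0, 1, 0, 1, 1, 1, 1]
```

Other criteria, such as fixing a minimum specificity, would select a different cut from the same sweep.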

Clinical Data 

[00168] The mean (SD) age of the patients with OSCC was 49.3 (7.5) years
(range, 42-67 years) vs. 48.8 (5.7) years (range, 40-65 years) in the control
group (Student's t test P > 0.80). Among the two subject groups, there were
no significant differences in terms of age (mean age: OSCC patients, 49.3
years; normal subjects, 48.8 years; Student's t test P > 0.80), gender
(Student's t test P > 0.90), or smoking history (Student's t test P > 0.75).

[00169] ROC (Receiver Operating Characteristic) curves, plots of sensitivities
versus 1 - specificities, were generated for each of the potential biomarkers.
Age, gender, and smoking history were controlled as described above. The
areas under the ROC curves were calculated, as measures of the utility of
each biomarker for detecting OSCC.

[00170] Figure 7A and Figure 7B show the ROC curves for IL8 in saliva and
IL6 in serum, respectively. The calculated ROC values (for predicting OSCC)
were 0.978 for IL8 in saliva and 0.824 for IL6 in serum. Based on the
distribution of sensitivities and specificities, thresholds of biomarkers were
chosen for detecting OSCC. Based upon our data, for IL8 in saliva, a
threshold value of 600 pg/dL yields a sensitivity of 86% and a specificity of
97%. Similarly, for IL6 in serum, a threshold value of greater than 0 pg/dL
yields a sensitivity of 64% and a specificity of 81%.

[00171] The combination of biomarkers IL-8 in saliva and IL-6 in serum holds
great potential for OSCC diagnostics, as ROC analysis yields a sensitivity of
99% and a specificity of 90%, as shown in Figure 7C.

[00172] The detailed statistics of the area under the ROC curves, the 
threshold values, and the corresponding sensitivities and specificities for each 
of the potential biomarkers in saliva and in serum are listed in Table 3. 

[00173] The detailed statistics of the area under the ROC curves, the
threshold values, and the corresponding sensitivities and specificities for each
of the potential biomarkers in saliva and in serum are listed in Table 3 below.

Table 3

Biomarker              Area under ROC   Threshold/Cutoff   Sensitivity   Specificity
IL8 saliva protein     0.978            600 pg/mL          86%           97%
IL6 serum protein      0.824            > 0 pg/mL          57%           100%
IL8 saliva protein     0.994            > 600 pg/mL        99%           90%
& IL6 serum protein                     > 0 pg/mL


Example 9: RNA Isolation, Amplification and Gene Expression
Profiling from Serum of OSCC Patients

Subject selection 

[00174] Thirty-two OSCC patients were recruited from Medical Centers at
University of California, Los Angeles (UCLA) and University of Southern
California (USC), Los Angeles, CA. All patients had recently been diagnosed
with primary T1/T2 OSCC, and had not received any prior treatment in the
form of chemotherapy, radiotherapy, surgery, or alternative remedies. Thirty-five
normal donors were recruited as controls from the general population at the
School of Dentistry, UCLA. No subjects had a history of prior malignancy,
immunodeficiency, autoimmune disorders, hepatitis, or HIV infection. All
subjects signed the Institutional Review Board approved consent form
agreeing to serve as blood donors for this study.

[00175] A total of sixty-seven subjects were recruited, including 32 OSCC
patients and 35 normal subjects. Among the two subject groups, there were
no significant differences in terms of mean age (standard deviation, SD):
OSCC patients, 49.3 (7.5) years; normal subjects, 47.8 (6.4) years (Student's
t test P = 0.84). The gender distribution in the OSCC group was 10:22 (female
number/male number) and in the control group was 14:21 (Chi-square test P ~ 1).
We matched the smoking history of these two groups by determining the
following. All subjects were asked: (1) For how many years had they smoked?
(2) How many packs per day had they smoked? (3) How many years had
elapsed since they had quit smoking (if they had indeed quit)? (4) Did they
only smoke cigarettes, or did they also use cigars, pipes, chewing tobacco, or
marijuana? We then optimized the match between patients and controls in
terms of the above: (1) similar pack-year history, (2) similar time lapse since
they had quit smoking, and (3) use of cigarettes exclusively. There was no
significant difference between the two groups in smoking history (Student's t
test P = 0.77).

Blood collection and processing.

[00176] The blood procurement procedure was approved by the institutional
review boards at UCLA and USC. Blood was drawn from control subjects and
patients prior to treatment. The whole blood was then centrifuged
at 1,000 x g for 10 minutes at 15°C in a Sorvall RT6000D centrifuge (DuPont,
Wilmington, DE). Serum was then separated, and 100 U/mL RNase inhibitor
(Superase-In, Ambion Inc., Austin, TX) was added promptly to the serum. The
aliquots were stored at -80°C until further use.

RNA isolation from serum. 

[00177] RNA was isolated from 560 µL of serum using the QIAamp Viral RNA kit
(Qiagen, Valencia, CA). Aliquots of isolated RNA were treated with RNase-free
DNase (DNaseI-DNA-free, Ambion Inc., Austin, TX) according to the
manufacturer's instructions. The quality of isolated RNA was examined by
RT-PCR for four housekeeping gene transcripts: β-actin (ACTB),
β-2-microglobulin (B2M), glyceraldehyde-3-phosphate dehydrogenase (GAPDH),
and ribosomal protein S9 (RPS9). Based on the published sequences,
oligonucleotide primers were designed and then synthesized (Sigma Genosys,
Woodlands, TX) for PCR. RT-PCR was performed to amplify the mRNAs'
coding regions, phenotyped in 3 segments using a common upstream primer
and three different downstream primers, for each of the four housekeeping
gene transcripts, as shown in Table 4.



Table 4

Name    Accession no.   Full length   Primer sequences      Amplicon
        (NCBI)          (bp)                                (bp)

ACTB    X00351          1761          F:  SEQ ID NO: 19
                                      R1: SEQ ID NO: 20     195
                                      R2: SEQ ID NO: 21     705
                                      R3: SEQ ID NO: 22     1000

B2M     NM_004048       987           F:  SEQ ID NO: 23
                                      R1: SEQ ID NO: 24     216
                                      R2: SEQ ID NO: 25     591
                                      R3: SEQ ID NO: 26     848

GAPDH   M33197          1268          F:  SEQ ID NO: 27
                                      R1: SEQ ID NO: 28     140
                                      R2: SEQ ID NO: 29     755
                                      R3: SEQ ID NO: 30     1184

RPS9    NM_001013       692           F:  SEQ ID NO: 31
                                      R1: SEQ ID NO: 32     188
                                      R2: SEQ ID NO: 33     426
                                      R3: SEQ ID NO: 34     614

[00178] In particular, four human serum mRNAs were selected and their coding
regions phenotyped in 3 segments using a common upstream primer and three
different downstream primers dividing the coding region approximately into
three parts. 10 µL of each PCR reaction was electrophoresed on a 2%
agarose gel and stained with EtBr.

[00179] Specificity of all the PCR products was verified by comparing the
predicted size with the positive control (Human Salivary Gland Total RNA,
Clontech, Palo Alto, CA). Negative controls were used in which input RNA was
omitted or in which RNA was used but reverse transcriptase was omitted.

[00180] The serum phenotype of human mRNA products was evaluated
by RT-PCR and electrophoresis. Exemplary results, reported in Figure 8,
showed that transcripts from four housekeeping genes (ACTB, B2M, GAPDH, and
RPS9) could be detected. In particular, amplicons for RPS9 with sizes of 188,
426 and 614 bp were detected (see Figure 8, lanes 2, 3 and 4, respectively);
amplicons for GAPDH with sizes of 140, 755 and 1,184 bp were detected (see
Figure 8, lanes 5, 6 and 7, respectively); amplicons for B2M with sizes of
216, 591 and 848 bp were detected (see Figure 8, lanes 8, 9 and 10,
respectively); and amplicons for ACTB with sizes of 195, 705 and 1,000 bp
were detected (see Figure 8, lanes 11, 12 and 13, respectively). Controls were
performed, although the control data are not shown in the Figure.

[00181] The longest PCR products we amplified covered 56.8% (ACTB),
85.9% (B2M), 93.4% (GAPDH) and 88.9% (RPS9) of the full length of the
corresponding mRNAs, according to the NCBI GenBank database. This result
also indicated that there could be intact human mRNA circulating in blood in a
cell-free form.
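The coverage figures above follow from dividing the longest amplicon by the full transcript length listed in Table 4. A minimal sketch of that arithmetic is given below; the dictionary simply restates the Table 4 values, and the names are illustrative. Computed this way, the ACTB, B2M and GAPDH values round to the stated percentages, while the RPS9 ratio (614/692) comes out at about 88.7%, slightly below the 88.9% quoted in the text.

```python
from typing import Dict, Tuple

# (full length in bp, longest amplicon in bp), restated from Table 4.
TABLE4_LENGTHS: Dict[str, Tuple[int, int]] = {
    "ACTB": (1761, 1000),
    "B2M": (987, 848),
    "GAPDH": (1268, 1184),
    "RPS9": (692, 614),
}


def coverage_percent(full_length_bp: int, amplicon_bp: int) -> float:
    """Percentage of the full-length transcript spanned by the amplicon."""
    return 100.0 * amplicon_bp / full_length_bp


COVERAGE = {gene: coverage_percent(full, amp)
            for gene, (full, amp) in TABLE4_LENGTHS.items()}

if __name__ == "__main__":
    for gene, pct in COVERAGE.items():
        print(f"{gene}: {pct:.1f}%")
```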

Example 10: Microarray Profiling of mRNA of Serum from OSCC
Patients

[00182] Serum from ten OSCC patients (8 male, 2 female, age = 51 ± 9.0) and
from ten gender- and age-matched normal donors (age = 49 ± 5.6) was
collected and processed as reported in Example 9 for use in microarray
analysis.

Microarray analysis

[00183] Isolated RNA from serum was subjected to linear amplification with the
RiboAmp™ RNA Amplification kit (Arcturus, Mountain View, CA). Following
previously reported protocols [55], the Affymetrix Human Genome U133A
Array, which contains 22,215 human gene cDNA probe sets representing
~19,000 genes (i.e., each gene may be represented by more than one probe
set), was applied for gene expression profiling.

[00184] The raw data were imported into DNA-Chip Analyzer 1.3 (dChip)
software for normalization and model-based analysis [60]. dChip gives the
expression index, which represents the amount of mRNA/gene expression,
and another parameter, called the present call, which indicates whether or not
the mRNA transcript was actually present in the sample (14). S-plus 6.0
(Insightful, Seattle, WA) was used for all statistical tests.

[00185] Three criteria were used to determine differentially expressed genes
between OSCCs and controls. First, genes that were assigned an "absent"
call in all samples were excluded. Second, a two-tailed Student's t test was
used for comparison of average gene expression levels between the OSCCs
(n=10) and controls (n=10). The critical alpha level of 0.05 was defined for
statistical significance. Third, fold ratios were calculated for those genes that
showed a statistically significant difference (P < 0.05). Only those genes that
exhibited at least a 2-fold change were included for further analysis.
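The three selection criteria above can be sketched as a simple filter. This illustration assumes that the present/absent calls, the t test P value and the group means have already been computed for each gene (for example by dChip and S-plus, as in the text); the record layout, field names and toy genes are hypothetical and carry no study data.

```python
from typing import Dict, List, Tuple


def select_differential(genes: List[Dict], alpha: float = 0.05,
                        min_fold: float = 2.0) -> List[Tuple[str, str, float]]:
    """Apply the three criteria: present in some sample, P < alpha, fold >= min_fold."""
    selected = []
    for g in genes:
        if not any(g["present_calls"]):      # criterion 1: absent in all samples
            continue
        if g["p_value"] >= alpha:            # criterion 2: not significant
            continue
        case, control = g["mean_oscc"], g["mean_control"]
        if case <= 0 or control <= 0:        # fold ratio undefined
            continue
        ratio = case / control
        fold = ratio if ratio >= 1 else 1 / ratio
        if fold >= min_fold:                 # criterion 3: at least 2-fold
            selected.append((g["name"], "up" if ratio >= 1 else "down", fold))
    return selected


# Hypothetical records for illustration only.
TOY_GENES = [
    {"name": "g1", "present_calls": [True, False], "p_value": 0.01,
     "mean_oscc": 400.0, "mean_control": 100.0},
    {"name": "g2", "present_calls": [False, False], "p_value": 0.001,
     "mean_oscc": 500.0, "mean_control": 100.0},
    {"name": "g3", "present_calls": [True], "p_value": 0.20,
     "mean_oscc": 300.0, "mean_control": 100.0},
    {"name": "g4", "present_calls": [True], "p_value": 0.03,
     "mean_oscc": 150.0, "mean_control": 100.0},
    {"name": "g5", "present_calls": [True], "p_value": 0.04,
     "mean_oscc": 50.0, "mean_control": 200.0},
]
```

On the toy records, only g1 (up-regulated) and g5 (down-regulated) pass all three criteria.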

[00186] The HG U133A microarrays were used to identify the difference in
salivary RNA profiles between cancer patients and matched normal subjects.
Among the 14,268 genes included by the previously described criteria, we
identified 335 genes with a P value less than 0.05 and a fold change > 2.
Among these genes, there are 223 up-regulated genes and 112 down-regulated
genes in the OSCC group. According to Affymetrix, a gene that is
assigned a present call is reliably detected in the
original sample. The number of genes that were assigned present and the
present percentage on each array are shown in Table 5, reporting the human
mRNA expression profiling in serum.


Table 5

           Normal                                      OSCC
Subject    Gender  Age     Present      Probe         Gender  Age     Present      Probe
                           Probes (a)   P% (b)                        Probes (a)   P% (b)
1          F       53      1564         7.02          F       55      1990         8.93
2          M       55      1600         7.18          M       61      2924         13.12
3          M       42      1600         7.18          M       42      2126         9.54
4          M       46      1716         7.7           M       46      3316         14.88
5          M       42      1845         8.28          M       42      2937         13.18
6          M       54      1854         8.32          M       52      1794         8.05
7          F       51      1903         8.54          F       67      2119         9.51
8          M       48      2032         9.12          M       46      2019         9.06
9          M       56      1823         8.18          M       61      4646         20.85
10         M       42      1979         8.88          M       44      2362         10.6
Mean±SD            49±5.6  1792±165     8.04±0.74             51±9.0  2623±868*    11.8±3.90

[00187] (a) Number of probes showing a present call on the HG U133A microarray
(detection P < 0.04).

[00188] (b) Present percentage (P%) = Number of probes assigned present
call / Number of total probes (22,283 for HG U133A microarray).
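The present percentage defined in footnote (b) and the group means of Table 5 can be reproduced with a few lines of arithmetic. The sketch below restates the present-probe counts from Table 5; the function and constant names are illustrative only.

```python
from statistics import mean

TOTAL_PROBES_HG_U133A = 22283  # total probes, as stated in footnote (b)

# Present-probe counts per array, restated from Table 5.
NORMAL_PRESENT = [1564, 1600, 1600, 1716, 1845, 1854, 1903, 2032, 1823, 1979]
OSCC_PRESENT = [1990, 2924, 2126, 3316, 2937, 1794, 2119, 2019, 4646, 2362]


def present_percent(present_probes: int,
                    total_probes: int = TOTAL_PROBES_HG_U133A) -> float:
    """Present percentage (P%) = present probes / total probes, in percent."""
    return 100.0 * present_probes / total_probes


if __name__ == "__main__":
    print(round(present_percent(NORMAL_PRESENT[0]), 2))
    print(round(mean(NORMAL_PRESENT)), round(mean(OSCC_PRESENT)))
```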

[00189] * The arrays for OSCC have significantly more probes assigned a
present call than those for the control group (P < 0.002, Wilcoxon test).

[00190] On average, there are 2623±868 probes in OSCC arrays and
1792±165 probes in control arrays that were assigned present calls.
The OSCC group has significantly more present probes than the control group
(P < 0.002, Wilcoxon test).

[00191] Using a more stringent criterion that, for a certain gene, the present
call was assigned consistently to all arrays among all cancers (n=10) or all
controls (n=10), we identified 62 genes to be the candidates for further
analysis. We noted that these genes are all up-regulated in OSCC serum,
whereas there are no genes found down-regulated using the same filtering
criteria.

Example 11: Q-PCR Validation and Quantitation Analysis of
Microarray Profiling from Cell-Free Saliva of OSCC Patients

[00192] qPCR was performed to quantify a subset of differently expressed 
transcripts in saliva and to validate the microarray findings of Example 10, on 
an enlarged sample size including saliva from 32 OSCC patients and 35 
controls. 

Quantitative PCR (qPCR) assay.

[00193] Primer sets were designed by using PRIMER3 software (Table 2).
Using MuLV reverse transcriptase (Applied Biosystems, Foster City, CA) and
random hexamers as primers (ABI, Foster City, CA), cDNA was synthesized
from the original and un-amplified serum RNA. The qPCR reactions were
performed in an iCycler™ iQ real-time PCR detection system (Bio-Rad,
Hercules, CA, USA), using iQ SYBR Green Supermix (Bio-Rad, Hercules,
CA). All reactions were performed in triplicate with customized conditions for
specific products. The relative amount of cDNA/RNA of a particular template
was extrapolated from the standard curve using the LightCycler software 3.0
(Bio-Rad, Hercules, CA, USA). A two-tailed Student's t test was used for
statistical analysis.

[00194] Ten significantly up-regulated genes, H3F3A, TPT1, FTH1, NCOA4,
ARCR, THSMB (Thymosin beta 10), PRKCB1 (Protein Kinase C, beta 1),
FTL1 (Ferritin Light polypeptide), COX4I1 (Cytochrome c oxidase subunit IV
isoform 1) and SERP1 (stress-associated endoplasmic reticulum protein 1;
ribosome associated membrane protein 4), were selected based on their
reported cancer association, as shown in Table 6, reporting the ten genes
selected for qPCR validation.

Table 6

Probe set ID    Gene name                                     Symbol    Accession No.   qPCR P
(HG U133A)                                                                              (t test)
211940_x_at     H3 histone, family 3A                         H3F3A     BE869922        0.003
211943_x_at     Tumor protein, translationally-controlled 1   TPT1      AL565449        0.005
200748_s_at     Ferritin, heavy polypeptide 1                 FTH1      NM_002032       0.008
210774_s_at     Nuclear receptor coactivator 4                NCOA4     AL162047        0.021
200059_s_at     Ras homolog gene family, member A             ARCR      BC001360        0.048
217733_s_at     Thymosin, beta 10                             THSMB     NM_021103       0.318
209685_s_at     Protein kinase C, beta 1                      PRKCB1    M13975          0.615
208755_x_at     Ferritin, light polypeptide                   FTL1      BF312331        0.651
200086_s_at     Cytochrome c oxidase subunit IV isoform 1     COX4I1    AA854966        0.688
200971_s_at     Stress-associated endoplasmic reticulum       SERP1     NM_014445       0.868
                protein 1; ribosome associated membrane
                protein 4


[00195] Table 6 presents their quantitative alterations in serum from OSCC
patients, determined by qPCR. The results confirmed that transcripts of
H3F3A, TPT1, FTH1, NCOA4 and ARCR were significantly elevated in the
saliva of OSCC patients (Wilcoxon Signed Rank test, P < 0.05). We did not
detect statistically significant differences in the amount of the other five
transcripts by qPCR.

EXAMPLE 12: ROC AND SENSITIVITY/SPECIFICITY ANALYSIS

[00196] Statistical analysis of the data collected from the
experiments reported in Examples 9 to 11 above demonstrates the specificity
and sensitivity of these biomarkers for HNSCC, and their predictive value.

Receiver Operating Characteristic Curve Analysis and Prediction Models.

[00197] Utilizing the qPCR results, multivariate classification models were
constructed to determine the best combination of the selected serum
transcripts for cancer prediction. Firstly, using the binary outcome of the
disease (OSCC) and non-disease (normal) as the dependent variable, a logistic
regression model was constructed [61]. Age, gender and smoking history were
controlled in the data collection procedure.

[00198] Leave-one-out cross validation was used to validate the logistic
regression model. The cross validation strategy first removes one observation
and then fits a logistic regression model from the remaining cases using all
markers. Stepwise model selection is used for each of these models to
remove variables that do not improve the model. Subsequently, the observed
values for the case that was left out are used to compute a predicted class
for that observation. The cross validation error rate is then the number of
samples predicted incorrectly divided by the number of samples.
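The leave-one-out procedure described above can be sketched generically as follows. To keep the example short and dependency-free, the stepwise logistic regression of the text is replaced by a much simpler stand-in classifier (a single-marker threshold at the midpoint of the two class means); that substitution is an assumption made purely for illustration, and the toy data are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence


def fit_midpoint(xs: Sequence[float], ys: Sequence[int]) -> float:
    """Stand-in model: threshold halfway between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (mean(pos) + mean(neg)) / 2


def predict_midpoint(threshold: float, x: float) -> int:
    """Predict class 1 when the marker exceeds the fitted threshold."""
    return 1 if x > threshold else 0


def loocv_error_rate(xs: List[float], ys: List[int],
                     fit: Callable, predict: Callable) -> float:
    """Leave one observation out, refit, predict it, and count the errors."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        if predict(model, xs[i]) != ys[i]:
            errors += 1
    return errors / len(xs)


# Hypothetical single-marker data (1 = case, 0 = control).
TOY_X = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
TOY_Y = [0, 0, 0, 1, 0, 1, 1, 1]
```

Because `fit` and `predict` are passed in as arguments, a logistic regression with stepwise selection could be substituted without changing the cross validation loop.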

[00199] The receiver operating characteristic (ROC) curve analysis was then
computed for the best final logistic model (S-plus 6.0), using the fitted
probabilities from the model as possible cut-points for computation of
sensitivity and specificity. Area under the curve was computed via numerical
integration of the ROC curve.
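The numerical integration of the ROC curve mentioned above can be sketched with the trapezoidal rule. The text does not specify which integration scheme S-plus applied, so the trapezoidal rule is an assumption here, and the fitted probabilities and labels below are hypothetical placeholders.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float],
               labels: Sequence[int]) -> List[Tuple[float, float]]:
    """(1 - specificity, sensitivity) pairs, using each fitted score as a cut-point."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cut)
        points.append((fp / n_neg, tp / n_pos))
    return points  # the lowest cut classifies everything positive: (1.0, 1.0)


def auc_trapezoid(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical fitted probabilities and labels (1 = case, 0 = control).
TOY_SCORES = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
TOY_LABELS = [0, 0, 1, 0, 1, 1, 1, 1]
```

On the toy data, the area equals the fraction of case-control pairs in which the case has the higher score, which is a useful sanity check on any AUC routine.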

[00200] To demonstrate the utility of circulating mRNAs in serum for OSCC
discrimination, two classification/prediction models were examined. Using the
qPCR data, a logistic regression model was built, composed of six serum
transcripts previously examined: ARHA, FTH1, H3F3A, TPT1, COX4I1 and
FTL1. Those six transcripts in combination provided the best prediction, which
was then validated by leave-one-out validation. Out of 67 leave-one-out
trials, [count and percentage illegible in source] of the best logistic models
were found to be the same model as the one from the whole data, and the
validation error rate was 31.3% (21/67).

[00201] Results are reported in Figure 9, wherein the ROC curve computed
for this logistic regression model is shown.

[00202] Using a cut-off probability of 44%, a sensitivity of 84% and a
specificity of 83% were obtained. The final model predicts correctly for 56
(83.5%) subjects out of 67, with 0.84 (27/32) sensitivity and 0.83 (29/35)
specificity, and it misclassifies 6 control subjects and 5 OSCC subjects. The
calculated area under the ROC curve was 0.88 for this logistic regression
model.
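The sensitivity, specificity and overall accuracy quoted above follow directly from the classification counts given in the paragraph (27 of 32 OSCC subjects and 29 of 35 controls correctly classified). A minimal sketch of that arithmetic, with illustrative names:

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
    }


# Counts stated in the text for the logistic regression model.
LOGISTIC_COUNTS = {"tp": 27, "fn": 5, "tn": 29, "fp": 6}
LOGISTIC_METRICS = classification_metrics(**LOGISTIC_COUNTS)

if __name__ == "__main__":
    for name, value in LOGISTIC_METRICS.items():
        print(f"{name}: {value:.3f}")
```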

Tree-based classification model, classification and regression tree (CART)

[00203] Secondly, another prediction model utilizing the qPCR results was
built by a tree-based classification method. The classification and regression
tree (CART) was constructed with S-plus 6.0 using the serum transcripts as
predictors from the qPCR results. CART fits the classification model by binary
recursive partitioning, where each step involves searching for the predictor
variable that results in the best split of the cancer versus the normal groups
[62]. CART used the entropy function with splitting criteria determined by
default settings for S-plus. By this approach, the parent group containing the
entire sample set (n=67) was subsequently divided into cancer groups and
normal groups. Our initial tree was pruned to remove all splits that did not
result in sub-branches with different classifications.

[00204] A second model, the "classification and regression trees (CART)
model", was generated according to the diagram reported in Figure 10.

[00205] Our fitted CART model used the serum mRNA concentrations of
THSMB and FTH1 as predictor variables for OSCC. THSMB, chosen as the
initial split, with a threshold of 4.59E-17 M, produced two child groups from
the parent group containing the total 67 samples. 47 samples with a THSMB
concentration < 4.59E-17 M were assigned into "Normal-1", while 20 with a
THSMB concentration > 4.59E-17 M were assigned into "Cancer-1". The
"Normal-1" group was further partitioned by FTH1 with a threshold of
8.44E-16 M. Of the resulting subgroups, "Normal-2" contained 28 samples with
FTH1 concentration < 8.44E-16 M, and "Cancer-2" contained 19 samples with
FTH1 concentration > 8.44E-16 M. Consequently, the 67 serum samples involved
in our study were classified into the "Normal" group and the "Cancer" group by
CART analysis.

[00206] The "Normal" group was composed of the samples from "Normal-2",
a total of 28 samples, 25 from normal subjects and 3 from cancer patients.
The "Cancer" group was composed of the samples from "Cancer-1" and
"Cancer-2", a total of 39 samples, 29 from cancer patients and 10 from
normal subjects. Therefore, by using the combination of these two serum
mRNAs (THSMB and FTH1) for OSCC prediction, the overall sensitivity is 91%
(29/32, in the cancer group) and the overall specificity is 71% (25/35, in the
normal group).

EXAMPLE 13: RNA ISOLATION, AMPLIFICATION AND GENE EXPRESSION
PROFILING FROM SALIVA OF OSCC PATIENTS

Patient Selection. 

[00207] OSCC patients were recruited from Medical Centers at University of
California, Los Angeles (UCLA); University of Southern California (USC), Los
Angeles, CA; and University of California San Francisco, San Francisco, CA.

[00208] Thirty-two patients with documented primary T1 or T2 OSCC were
included. All of the patients had recently received diagnoses of primary
disease and had not received any prior treatment in the form of
chemotherapy, radiotherapy, surgery, or alternative remedies.

[00209] An equal number of age- and sex-matched subjects with comparable
smoking histories were selected as a control group. Between the two subject
groups, there were no significant differences in terms of mean age (OSCC
patients, 49.8 ± 7.6 years; normal subjects, 49.1 ± 5.9 years; Student's t test,
P > 0.80); gender (P > 0.90); or smoking history (P > 0.75). No subjects had a
history of prior malignancy, immunodeficiency, autoimmune disorders,
hepatitis, or HIV infection. All of the subjects signed the institutional review
board-approved consent form agreeing to serve as saliva donors for the
experiments.

Saliva Collection and RNA Isolation. 

[00210] Unstimulated saliva samples were collected between 9 a.m. and 10
a.m. with previously established protocols [38]. Subjects were asked to refrain
from eating, drinking, smoking, or oral hygiene procedures for at least 1 hour
before the collection. Saliva samples were centrifuged at 2,600 × g for 15
minutes at 4°C.

[00211] The supernatant was removed from the pellet and treated with
RNase inhibitor (Superase-In, Ambion Inc., Austin, TX). RNA was isolated
from 560 µL of saliva supernatant with QIAamp Viral RNA kit (Qiagen,
Valencia, CA). Aliquots of isolated RNA were treated with RNase-free DNase
(DNase I-DNA-free, Ambion Inc.) according to the manufacturer's instructions.
The quality of isolated RNA was examined by RT-PCR for three cellular
maintenance gene transcripts: glyceraldehyde-3-phosphate dehydrogenase
(GAPDH), actin-β (ACTB), and ribosomal protein S9 (RPS9). Only those
samples exhibiting PCR products for all three mRNAs were used for
subsequent analysis.

[00212] On average, 54.2 ± 20.1 ng (n = 64) of total RNA was obtained from
560 µL of saliva supernatant. There was no significant difference in total RNA
quantity between the OSCC patients and matched controls (t test, P = 0.29,
n = 64). RT-PCR results demonstrated that all of the saliva samples (n = 64)
contained transcripts from three genes (GAPDH, ACTB, and RPS9), which were
used as quality controls for human salivary RNAs [55]. A consistent amplifying
magnitude (658 ± 47.2, n = 5) could be obtained after two rounds of RNA
amplification. On average, the yield of biotinylated cRNA was 39.3 ± 6.0 µg
(n = 20). There were no significant differences in the cRNA quantity yielded
between the OSCC patients and the controls (t test, P = 0.31, n = 20).

EXAMPLE 14: MICROARRAY PROFILING OF mRNA OF SALIVA FROM OSCC PATIENTS

[00213] Saliva from 10 OSCC patients (7 male, 3 female; age, 52 ± 9.0
years) and from 10 gender- and age-matched normal donors (age, 49 ± 5.6
years) was used for a microarray study. Isolated RNA from saliva was
subjected to linear amplification by RiboAmp RNA Amplification kit (Arcturus,
Mountain View, CA). The RNA amplification efficiency was measured by using
control RNA of known quantity (0.1 µg) running in parallel with the 20 samples
in five independent runs.

Microarray Analysis. 

[00214] Following previously reported protocols [55], the Human Genome
U133A Array (HG U133A, Affymetrix, Santa Clara, CA) was applied for gene
expression analysis. The arrays were scanned and the fluorescence intensity
was measured by Microarray Suite 5.0 software (Affymetrix, Santa Clara, CA);
the arrays were then imported into DNA-Chip Analyzer software
(http://www.dchip.org) for normalization and model-based analysis [60].
S-plus 6.0 (Insightful, Seattle, WA) was used to carry out all statistical tests.

[00215] Three criteria were used to determine differentially expressed gene
transcripts. First, probe sets on the array that were assigned an "absent" call
in all samples were excluded. Second, a two-tailed Student's t test was used
for comparison of average gene expression signal intensity between the
OSCCs (n = 10) and controls (n = 10). The critical level of 0.05 was defined
for statistical significance. Third, fold ratios were calculated for those gene
transcripts that showed a statistically significant difference (P < 0.05). Only
those gene transcripts that exhibited at least a 2-fold change were included for
further analysis.
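
The three selection criteria can be sketched as a sequential filter. The Python below is a minimal illustration under stated assumptions, not the procedure actually run in dChip and S-plus: the record layout, the "A"/"P" detection-call codes, the example probe sets and the function name are all hypothetical, the t test P values are taken as already computed, and "at least 2-fold change" is read as a change in either direction.

```python
from statistics import mean


def select_transcripts(probe_sets, alpha=0.05, min_fold=2.0):
    """Apply the three filters in order; return (name, fold ratio) pairs."""
    selected = []
    for probe in probe_sets:
        # 1. Drop probe sets called "absent" (coded "A" here) in every sample.
        if all(call == "A" for call in probe["calls"]):
            continue
        # 2. Keep only transcripts whose t test P value is below alpha.
        if probe["p_value"] >= alpha:
            continue
        # 3. Require at least a min_fold change, up or down.
        fold = mean(probe["oscc"]) / mean(probe["control"])
        if fold >= min_fold or fold <= 1.0 / min_fold:
            selected.append((probe["name"], fold))
    return selected


# Hypothetical probe sets, for illustration only.
probes = [
    {"name": "probe_up", "calls": ["P", "P", "A", "P"], "p_value": 0.004,
     "oscc": [900.0, 1100.0], "control": [240.0, 260.0]},
    {"name": "probe_absent", "calls": ["A", "A", "A", "A"], "p_value": 0.001,
     "oscc": [50.0, 70.0], "control": [10.0, 20.0]},
    {"name": "probe_small", "calls": ["P", "P", "P", "P"], "p_value": 0.020,
     "oscc": [300.0, 320.0], "control": [200.0, 220.0]},
    {"name": "probe_ns", "calls": ["P", "P", "P", "P"], "p_value": 0.300,
     "oscc": [600.0, 600.0], "control": [200.0, 200.0]},
]
selected = select_transcripts(probes)
print(selected)
```
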

[00216] The HG U133A microarrays were used to identify the difference in
salivary RNA profiles between cancer patients and matched normal subjects.
Among the 10,316 transcripts included by the previously described criteria, we
identified 1,679 transcripts with a P value less than 0.05. Among these
transcripts, 836 were up-regulated and 843 were down-regulated in the OSCC
group. The number of transcripts observed was unlikely to be attributable to
chance alone (χ² test, P < 0.0001), considering the false positives expected
with P < 0.05. Using the predefined, more stringent criteria of a change in
regulation >3-fold in all 10 OSCC saliva specimens and a P value cutoff of
< 0.01, 17 mRNAs were identified as showing significant up-regulation in OSCC
saliva. It should be noted that these 17 salivary mRNAs are all up-regulated in
OSCC saliva, whereas no mRNAs were found to be down-regulated with the
same filtering criteria. The biological functions of these genes and their
products are presented in Table 7, which shows salivary mRNAs up-regulated
(>3-fold, P < 0.01) in OSCC as identified by microarray.
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Table 7 



Gene symbol | Gene name | GenBank accession No. | Locus | Gene functions
------------|-----------|-----------------------|-------|---------------
B2M | β-2-microglobulin | NM_004048 | 15q21-q22.2 | Anti-apoptosis; antigen presentation
DUSP1 | Dual specificity phosphatase 1 | NM_004417 | 5q34 | Protein modification; signal transduction; oxidative stress
FTH1 | Ferritin, heavy polypeptide 1 | NM_002032 | 11q13 | Iron ion transport; cell proliferation
G0S2 | Putative lymphocyte G0-G1 switch gene | NM_015714 | 1q32.2-q41 | Cell growth and/or maintenance; regulation of cell cycle
GADD45B | Growth arrest and DNA-damage-inducible β | NM_015675 | 19p13.3 | Kinase cascade; apoptosis
H3F3A | H3 histone, family 3A | BE869922 | 1q41 | DNA binding activity
HSPC016 | Hypothetical protein HSPC016 | BG167522 | 3p21.31 | Unknown
IER3 | Immediate early response 3 | NM_003897 | 6p21.3 | Embryogenesis; morphogenesis; apoptosis; cell growth and maintenance
IL1B | Interleukin 1β | M15330 | 2q14 | Signal transduction; proliferation; inflammation; apoptosis
IL8 | Interleukin 8 | NM_000584 | 4q13-q21 | Angiogenesis; replication; calcium-mediated signaling pathway; cell adhesion; chemotaxis; cell cycle arrest; immune response
MAP2K3 | Mitogen-activated protein kinase kinase 3 | AA780381 | 17q11.2 | Signal transduction; protein modification
OAZ1 | Ornithine decarboxylase antizyme 1 | D87914 | 19p13.3 | Polyamine biosynthesis
PRG1 | Proteoglycan 1, secretory granule | NM_002727 | 10q22.1 | Proteoglycan
RGS2 | Regulator of G-protein signaling 2, 24 kDa | NM_002923 | 1q31 | Oncogenesis; G-protein signal transduction
S100P | S100 calcium binding protein P | NM_005980 | 4p16 | Protein binding; calcium ion binding
SAT | Spermidine/spermine N1-acetyltransferase | NM_002970 | Xp22.1 | Enzyme, transferase activity
EST | Highly similar to ferritin light chain | BG537190 | | Iron ion homeostasis; ferritin complex





[00217] The Human Genome U133A microarrays were used to identify the
difference in RNA expression patterns in saliva from 10 cancer patients and
10 matched normal subjects. Using the criteria of a change in regulation
>3-fold in all 10 OSCC saliva specimens and a P value cutoff of < 0.01, we
identified 17 mRNAs showing significant up-regulation in OSCC saliva.

EXAMPLE 15: Q-PCR VALIDATION AND QUANTITATION ANALYSIS OF
MICROARRAY PROFILING FROM CELL-FREE SALIVA OF OSCC PATIENTS

[00218] Quantitative polymerase chain reaction (qPCR) was performed to
validate a subset of differentially expressed transcripts identified by the
microarray analysis of Example 14.

Quantitative Polymerase Chain Reaction Validation. 

[00219] cDNA from the original, unamplified salivary RNA was synthesized
using MuLV reverse transcriptase (Applied Biosystems, Foster City, CA) and
random hexamers as primers (Applied Biosystems). The qPCR reactions were
performed in an iCycler PCR system with iQ SYBR Green Supermix (Bio-Rad,
Hercules, CA). Primer sets were designed by using PRIMER3 software
(http://www.genome.wi.mit.edu).

[00220] All of the reactions were performed in triplicate with customized
conditions for specific products. The initial amount of cDNA/RNA of a
particular template was extrapolated from the standard curve as described
previously [32]. This validation was completed by testing all of the samples
(n = 64), including the 20 previously used for the microarray study. The
Wilcoxon Signed-Rank test was used for statistical analysis.
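
Extrapolating an initial template amount from a standard curve can be illustrated as follows. This is a minimal sketch, assuming the common qPCR convention in which the threshold cycle (Ct) is linear in the base-10 logarithm of the starting quantity; the exact procedure of reference [32] is not reproduced here, and the dilution series, Ct values and function names are hypothetical.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ct_values) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting quantity."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 10-fold dilution series (arbitrary units) and measured Ct.
standards = [1e2, 1e3, 1e4, 1e5]
cts = [30.0, 26.7, 23.4, 20.1]
slope, intercept = fit_standard_curve(standards, cts)

# Estimate the starting quantity of an unknown sample from its Ct.
estimate = quantity_from_ct(25.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, estimate={estimate:.0f}")
```

In practice the triplicate Ct values of each sample would be averaged before the curve is inverted.
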

[00221] Quantitative PCR was performed to validate the microarray findings
on an enlarged sample size including saliva from 32 OSCC patients and 32
matched controls. Nine candidate salivary mRNA biomarkers (DUSP1,
GADD45B, H3F3A, IL1B, IL8, OAZ1, RGS2, S100P, and SAT) were selected
from Table 7 based on their reported cancer association. Table 8 presents the
quantitative alterations of the above nine candidates in saliva from OSCC
patients, determined by qPCR.

Table 8

Gene symbol | Primer sequence (5' to 3') | Validated* | P value | Mean fold increase
------------|----------------------------|------------|---------|-------------------
DUSP1 | F: SEQ ID NO: 35; R: SEQ ID NO: 36 | Yes | 0.039 | 2.60
H3F3A | F: SEQ ID NO: 37; R: SEQ ID NO: 38 | Yes | 0.011 | 5.61
IL1B | F: SEQ ID NO: 39; R: SEQ ID NO: 40 | Yes | 0.005 | 5.48
IL8 | F: SEQ ID NO: 41; R: SEQ ID NO: 42 | Yes | 0.000 | 24.3
OAZ1 | F: SEQ ID NO: 43; R: SEQ ID NO: 44 | Yes | 0.009 | 2.82
S100P | F: SEQ ID NO: 45; R: SEQ ID NO: 46 | Yes | 0.003 | 4.88
SAT | F: SEQ ID NO: 47; R: SEQ ID NO: 48 | Yes | 0.005 | 2.98
GADD45B | F: SEQ ID NO: 49; R: SEQ ID NO: 50 | No | 0.116 |
RGS2 | F: SEQ ID NO: 51; R: SEQ ID NO: 52 | No | 0.149 |

Seven of the nine potential candidates were validated by qPCR (P < 0.05).
* Wilcoxon's Signed Rank test: if P < 0.05, validated (Yes); if P > 0.05, not
validated (No).

[00222] The results confirmed that transcripts of 7 of the 9 candidate mRNAs
(78%), DUSP1, H3F3A, IL1B, IL8, OAZ1, S100P, and SAT, were significantly
elevated in the saliva of OSCC patients (Wilcoxon Signed-Rank test, P < 0.05).
We did not detect statistically significant differences in the amounts of
RGS2 (P = 0.149) and GADD45B (P = 0.116) by qPCR. The seven validated
genes could be classified into three ranks by the magnitude of increase:
highly up-regulated mRNA, including IL8 (24.3-fold); moderately up-regulated
mRNAs, including H3F3A (5.61-fold), IL1B (5.48-fold), and S100P (4.88-fold);
and low up-regulated mRNAs, including DUSP1 (2.60-fold), OAZ1 (2.82-fold),
and SAT (2.98-fold).

EXAMPLE 16: ROC AND SENSITIVITY/SPECIFICITY ANALYSIS

[00223] Using the qPCR results, Receiver Operating Characteristic (ROC)
curve analysis was performed [82] in S-plus 6.0 to evaluate the predictive
power of each of the biomarkers identified in Example 15.

Receiver Operating Characteristic Curve Analysis and Prediction Models. 

[00224] The optimal cutpoint was determined for each biomarker by
searching for those that yielded the maximum corresponding sensitivity and
specificity. ROC curves were then plotted on the basis of the set of optimal
sensitivity and specificity values. Area under the curve was computed via
numerical integration of the ROC curves. The biomarker that has the largest
area under the ROC curve was identified as having the strongest predictive
power for detecting OSCC.
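
The single-marker ROC analysis described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the S-plus routine actually used: a sample is called positive when its marker value is at or above the cutpoint, the "optimal" cutpoint is taken to be the one that maximizes the sum of sensitivity and specificity, the area is obtained by trapezoidal integration, and the marker values are hypothetical.

```python
def roc_points(case_values, control_values):
    """Sensitivity and specificity at every observed cutpoint."""
    cutpoints = sorted(set(case_values) | set(control_values))
    points = []
    for cut in cutpoints:
        # A sample is called positive when its value is >= cut.
        sens = sum(v >= cut for v in case_values) / len(case_values)
        spec = sum(v < cut for v in control_values) / len(control_values)
        points.append((cut, sens, spec))
    return points


def best_cutpoint(points):
    """Cutpoint maximizing sensitivity + specificity (assumed criterion)."""
    return max(points, key=lambda p: p[1] + p[2])


def area_under_curve(points):
    """Trapezoidal integration of sensitivity over (1 - specificity)."""
    curve = sorted({(1.0 - spec, sens) for _, sens, spec in points}
                   | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical marker values (arbitrary units), for illustration only.
cases = [5.0, 6.0, 7.0, 8.0]
controls = [1.0, 2.0, 3.0, 6.5]
points = roc_points(cases, controls)
cut, sens, spec = best_cutpoint(points)
auc = area_under_curve(points)
print(f"cutpoint={cut}, sensitivity={sens}, specificity={spec}, AUC={auc}")
```
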

[00225] Next, multivariate classification models were constructed to
determine the best combination of salivary markers for cancer prediction.
Firstly, using the binary outcome of the disease (OSCC) and nondisease
(normal) as dependent variables, we constructed a logistic regression model
controlling for patient age, gender, and smoking history. The backward
stepwise regression [61] was used to find the best final model.

[00226] Leave-one-out cross-validation was used to validate the logistic
regression model. The cross-validation strategy first removes one observation
and then fits a logistic regression model from the remaining cases with all of
the markers. Stepwise model selection is used for each of these models to
remove variables that do not improve the model. Subsequently, the marker
values for the case that was left out were used to compute a predicted class
for that observation. The cross-validation error rate is then the number of
samples predicted incorrectly divided by the number of samples.

[00227] The ROC curve, illustrated in Figure 11, was then computed for the
logistic model by a similar procedure, with the fitted probabilities from the
model as possible cutpoints for computation of sensitivity and specificity.
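
The leave-one-out procedure can be written as a generic loop that refits a model without each observation in turn. The Python sketch below is illustrative only: to stay short it uses a stand-in single-marker midpoint classifier instead of the stepwise logistic regression described in the text, and all data values and function names are hypothetical.

```python
def leave_one_out_error(samples, labels, fit, predict):
    """Fraction of held-out samples misclassified by models fit on the rest."""
    errors = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)      # refit without observation i
        if predict(model, samples[i]) != labels[i]:
            errors += 1
    return errors / len(samples)


# Stand-in classifier (NOT the stepwise logistic regression of the text):
# the threshold is the midpoint between the two class means of one marker.
def fit_midpoint(xs, ys):
    cases = [x for x, y in zip(xs, ys) if y == 1]
    controls = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(cases) / len(cases) + sum(controls) / len(controls)) / 2.0


def predict_midpoint(threshold, x):
    return 1 if x > threshold else 0


# Hypothetical marker values; label 1 = OSCC, label 0 = normal.
values = [1.0, 2.0, 3.0, 6.0, 5.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
error = leave_one_out_error(values, labels, fit_midpoint, predict_midpoint)
print(f"leave-one-out error rate: {error:.2f}")
```

Passing the fitting and prediction steps as functions keeps the loop independent of the model, so a stepwise logistic regression routine could be substituted without changing the cross-validation code.
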

[00228] The detailed statistics of the area under the receiver operator
characteristics (ROC) curves, the threshold values, and the corresponding
sensitivities and specificities for each of the seven potential salivary mRNA
biomarkers for OSCC are listed in Table 9, which shows the ROC curve analysis
of OSCC-associated salivary mRNA biomarkers.

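Table 9

Biomarker | Area under ROC curve | Threshold/cutoff (M) | Sensitivity (%) | Specificity (%)
----------|----------------------|----------------------|-----------------|----------------
DUSP1 | 0.65 | 8.35E-17 | 59 | 75
H3F3A | 0.68 | 1.58E-15 | 53 | 81
IL1B | 0.70 | 4.34E-16 | 63 | 72
IL8 | 0.85 | 3.19E-18 | 88 | 81
OAZ1 | 0.69 | 7.42E-17 | 100 | 38
S100P | 0.71 | 2.11E-15 | 72 | 63
SAT | 0.70 | 1.56E-15 | 81 | 56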

[00229] Utilizing the qPCR results, we conducted ROC curve analyses to
evaluate the predictive power of each of the biomarkers. The optimal cutpoint
was determined as the one yielding the maximum corresponding sensitivity and
specificity. The biomarker that has the largest area under the ROC curve was
identified as having the strongest predictive power for detecting OSCC.

[00230] The data showed that IL8 mRNA performed the best among the seven
potential biomarkers for predicting the presence of OSCC. The calculated
area under the ROC curve for IL8 was 0.85. With a threshold value of
3.19E-18 mol/L, IL8 mRNA in saliva yields a sensitivity of 88% and a
specificity of 81% to distinguish OSCC from the normal.

[00231] To demonstrate the utility of salivary mRNAs for disease
discrimination, two classification/prediction models were examined. A logistic
regression model was built based on four of the seven validated biomarkers,
IL1B, OAZ1, SAT, and IL8, which in combination provided the best prediction
(Table 10). Table 10 shows the salivary mRNA biomarkers for OSCC selected by
the logistic regression model.

Table 10

Biomarker | Coefficient value | SE | P value
----------|-------------------|----|--------
Intercept | -4.79 | 1.51 | 0.001
IL1B | 5.10E+19 | 2.68E+19 | 0.062
OAZ1 | 2.18E+20 | 1.08E+20 | 0.048
SAT | 2.63E+19 | 1.10E+19 | 0.020
IL8 | 1.36E+17 | 4.75E+16 | 0.006

[00232] The logistic regression model was built based on four of the seven
validated biomarkers (IL1B, OAZ1, SAT, and IL8) that, in combination,
provided the best prediction. The coefficient values are positive for these four
markers, indicating that the synchronized increase in their concentrations in
saliva increased the probability that the sample was obtained from an OSCC
subject.

[00233] The leave-one-out cross-validation error rate based on logistic
regression models was 19% (12 of 64). All but one (of the 64) of the models
generated in the leave-one-out analysis used the same set of four markers
found to be significant in the full data model specified in Table 10.

[00234] The ROC curve was computed for the logistic regression model.
Using a cutoff probability of 50%, we obtained a sensitivity of 91% and a
specificity of 91%. The calculated area under the ROC curve was 0.95 for the
logistic regression model (Fig. 11).
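
For illustration, the way a fitted logistic model turns marker concentrations into a class call at a 50% cutoff can be sketched as below. Several points are assumptions and should be read as such: the coefficients are copied from Table 10, but the text does not state the measurement scale on which they apply, so treating the inputs as mol/L is a guess; any terms for age, gender and smoking history are omitted because Table 10 lists none; and the sample concentrations are hypothetical values chosen only to exercise the function.

```python
import math

# Coefficients as listed in Table 10; the input scale is an assumption.
INTERCEPT = -4.79
COEFFICIENTS = {"IL1B": 5.10e19, "OAZ1": 2.18e20, "SAT": 2.63e19, "IL8": 1.36e17}


def oscc_probability(concentrations):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    score = INTERCEPT + sum(
        COEFFICIENTS[gene] * concentrations[gene] for gene in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-score))


def classify(concentrations, cutoff=0.50):
    """Call a sample OSCC when the fitted probability reaches the cutoff."""
    return "OSCC" if oscc_probability(concentrations) >= cutoff else "Normal"


# Hypothetical sample, not taken from the study data.
sample = {"IL1B": 4.0e-20, "OAZ1": 1.0e-20, "SAT": 5.0e-20, "IL8": 1.0e-17}
probability = oscc_probability(sample)
print(f"p(OSCC) = {probability:.3f} -> {classify(sample)}")
```

Because all four coefficients are positive, raising any one concentration can only raise the fitted probability, which is the behavior described in paragraph [00232].
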

Tree-based classification model: classification and regression tree (CART)

[00235] A second model, a tree-based "classification and regression tree
(CART) model", was generated. The CART model was constructed in S-plus 6.0
with the validated mRNA biomarkers as predictors. CART fits the classification
model by binary recursive partitioning, in which each step involves searching
for the predictor variable that results in the best split of the cancer versus
the normal groups [62]. CART used the entropy function, with splitting criteria
determined by the default settings of S-plus. By this approach, the parent
group containing all of the samples (n = 64) was subsequently divided into
cancer groups and normal groups. Our initial tree was pruned to remove all
splits that did not result in sub-branches with different classifications.
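
A single step of entropy-based binary partitioning can be illustrated as follows. This is a minimal sketch and not the S-plus implementation: it searches one predictor only, tries midpoints between adjacent observed values as candidate thresholds, and scores each by the size-weighted entropy of the two child groups. The data and function names are hypothetical.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / total
        result -= p * math.log2(p)
    return result


def best_split(values, labels):
    """Threshold on one predictor that minimizes weighted child entropy."""
    best = None
    ordered = sorted(set(values))
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2.0   # midpoint between adjacent values
        left = [y for x, y in zip(values, labels) if x < threshold]
        right = [y for x, y in zip(values, labels) if x >= threshold]
        weighted = (len(left) * entropy(left)
                    + len(right) * entropy(right)) / len(labels)
        if best is None or weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical marker values with a clean separation between the groups.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = ["normal", "normal", "normal", "cancer", "cancer", "cancer"]
split = best_split(values, labels)
print(split)
```

A full tree repeats this search over every predictor at each node and recurses into the child groups, followed by pruning as described above.
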

[00236] Results are shown in the diagram of Fig. 12. Our fitted CART model
used the salivary mRNA concentrations of IL8, H3F3A, and SAT as predictor
variables for OSCC. IL8, chosen as the initial split with a threshold of
3.14E-18 mol/L, produced two child groups from the parent group containing
the total 64 samples. 30 samples with an IL8 concentration < 3.14E-18 mol/L
were assigned to "Normal-1", whereas 34 with an IL8 concentration
> 3.14E-18 mol/L were assigned to "Cancer-1". The "Normal-1" group was
further partitioned by SAT with a threshold of 1.13E-14 mol/L.

[00237] Of the resulting subgroups, "Normal-2" contained 25 samples with a
SAT concentration < 1.13E-14 mol/L, and "Cancer-2" contained 5 samples with a
SAT concentration > 1.13E-14 mol/L. Similarly, the "Cancer-1" group was
further partitioned by H3F3A with a threshold of 2.07E-16 mol/L. Of the
resulting subgroups, "Cancer-3" contained 27 samples with an H3F3A
concentration > 2.07E-16 mol/L, and "Normal-3" contained 7 samples
with an H3F3A concentration < 2.07E-16 mol/L.
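
The fitted tree reduces to a short set of nested threshold rules, sketched here in Python. The thresholds are those reported above; the function names are illustrative, and the handling of values exactly equal to a threshold is an assumption, since the text specifies only strict inequalities.

```python
IL8_CUT = 3.14e-18     # mol/L, initial split
SAT_CUT = 1.13e-14     # mol/L, applied when IL8 is below its cut
H3F3A_CUT = 2.07e-16   # mol/L, applied when IL8 is at or above its cut


def cart_leaf(il8, sat, h3f3a):
    """Return the terminal group for one saliva sample (inputs in mol/L).

    Ties are an assumption: a value equal to a cut is sent to the branch
    written with >= (IL8, SAT) or <= (H3F3A) below.
    """
    if il8 < IL8_CUT:
        return "Normal-2" if sat < SAT_CUT else "Cancer-2"
    return "Cancer-3" if h3f3a > H3F3A_CUT else "Normal-3"


def cart_class(il8, sat, h3f3a):
    """Collapse the terminal groups into the final Cancer/Normal call."""
    leaf = cart_leaf(il8, sat, h3f3a)
    return "Cancer" if leaf.startswith("Cancer") else "Normal"


# Hypothetical sample: IL8 above its cut and H3F3A above its cut.
print(cart_class(il8=5.0e-18, sat=2.0e-15, h3f3a=3.0e-16))
```
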

[00238] Consequently, the 64 saliva samples involved in our study were
classified into the "Cancer" group and the "Normal" group by CART analysis.
The "Normal" group was composed of the samples from "Normal-2" and those
from "Normal-3". There are a total of 32 samples assigned to the "Normal"
group, 29 from normal subjects and 3 from cancer patients.

[00239] Thus, by using the combination of IL8, SAT, and H3F3A for OSCC
prediction, the overall sensitivity is 90.6% (29 of 32). The "Cancer" group was
composed of the samples from "Cancer-2" and "Cancer-3". There are a total
of 32 samples assigned to the final "Cancer" group, 29 from cancer patients
and 3 from normal subjects. Therefore, by using the combination of these
three salivary mRNA biomarkers for OSCC prediction, the overall specificity is
90.6% (29 of 32).

[00240] In summary, the present disclosure refers to: a method to detect a
biomarker in saliva, wherein the biomarker is an extracellular mRNA,
comprising detecting the extracellular mRNA in the cell-free saliva;
transcriptome analysis of saliva, comprising detecting a transcriptome pattern
in the cell-free saliva; a method to detect genetic alterations in an organ or in
a gene in the organ by analyzing saliva, comprising detecting a transcriptome
pattern and/or the mRNA profiling of the gene in cell-free saliva; a method to
diagnose an oral or systemic pathology, disease or disorder in a subject,
comprising detecting a profile of a biomarker associated with the pathology,
disease or disorder, in particular mRNA and/or protein, in cell-free saliva
and/or serum; kits comprising an identifier for at least one biomarker for
performing at least one of the methods; and use of salivary and/or serum
mRNAs as biomarkers for an oral and/or systemic pathology, disease or
disorder.

[00241] The disclosures of each and every publication and reference cited
herein are hereby incorporated herein by reference in their entirety.

[00242] The present disclosure has been explained with reference to specific
embodiments. Other embodiments will be apparent to those of ordinary skill
in the art in view of the foregoing description. The scope of protection of the
present disclosure is defined by the appended claims.
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